editor's letter


An Issue of Teamwork 


After all, I’ve been part of that monthly 
process since Bill Gates earned his first 
billion. One column, coming up. 

So, why have I been staring at a blank screen the last three days? The reason,
I must admit, is the same reason I’ve enjoyed this work for so long. The real
beauty of working on ACM’s flagship publication is that nothing is ever the
same. Unlike the majority of commercial and trade magazines, Communications
rarely covers the same topic twice. Each issue truly follows a unique path
to fruition.

In an effort to impart some sense of the production process, let’s take
this issue as an example. Many of the seeds for this edition were planted by
mid-2009. Our cover story on quantum algorithms was invited by the EiC, but
often the articles and viewpoints have been submitted and accepted after
rigorous review by members of Communications’ Editorial Board (for details,
see Author Guidelines
http://cacm.acm.org/about-communications/author-center/author-guidelines).
The News Board, working with Senior Editor Jack Rosenberger, teleconferences
monthly to determine the most important and timely stories to pursue for each
issue. Managing Editor Tom Lambert works with the Viewpoints Board to ensure
that section presents a compelling collection of columnists and commentary,
delivering many diverse views on science and industry practices from around
the world. Monthly features like CACM Online and Last Byte are also spirited
by HQ editors.


The editorial lineup for each issue is 
a collaborative effort by the EiC, Board 
section chairs, and HQ staff. The EiC 
has the final word on all things edito- 
rial and makes the ultimate decision 
on the cover story and cover image. In 
a perfect world, all the manuscripts 
slated for each issue (save news stories)
should be in the hands of HQ editors 
eight weeks prior to publication. That’s 
when the real production cycle begins. 


Pulling the Pieces Together 

While the editorial elements have had months to simmer, the cover image
and assorted graphical complements throughout this issue have only a few
weeks to be crafted. The goal of the editorial artwork is to draw readers into
an article by reflecting the subject in an intriguing way or by adding a
creative spin to the subject. Given the assortment of topics in any edition,
this is no small task. Therefore, one of the first steps in the process is a
meeting with Group Publisher Scott Delman, staff editors, Art Director Andrij
Borys, and Associate Art Director Alicia Kubista. Editors relay the gist of
each article and volley ideas on how to best represent them graphically. Cover
ideas take shape and artists and photographers are tapped. Over the weeks we
watch early sketches turn into colorful works. Cover concepts can go through
many iterations, often changing direction and detail as they progress.


While artwork is germinating, Lambert, Rosenberger, and Senior Editor
Andrew Rosenbloom are editing every manuscript slated for the month.
Working closely with the authors, their goal is to help make each article
crisp and tight. Edited articles are sent to Assistant Art Director Mia
Angelica Balaquiot, who flows the text into page galleys and redraws every
submitted figure and table accompanying those articles to ensure a
professional uniform presentation. Galleys are sent to the contact author for
each article for one last look prior to publication. All final editorial
tweaks are made; the design phase then takes over.

This is the juncture where editorial, artwork, and over 100 empty magazine
pages merge. Borys and his team work under the ever-looming deadline to
pull all the pieces together into a cohesive unit that flows from news
stories, to viewpoints, to feature articles, to research, with lots of
eye-catching elements in between. Staff editors read over every page one last
time.

Advertising Sales Coordinator Jennifer Ruzicka submits the ad pages
slated for the issue, and Production Manager Lynn D’Addesio coordinates
every move between HQ and the printing facilities in Richmond, VA, where
the final product is ultimately shipped.

As soon as one issue of Communications is “put to bed,” we all move to
the next one. There’s art to discuss, authors to contact, and deadlines,
always deadlines. The process may not always run this smoothly (ah, perfect
world), but each issue always reflects a total team effort, and it is always a
privilege to be part of that team.

FEBRUARY 2010


Diane Crawford, EXECUTIVE EDITOR 
In the Virtual Extension


Communications’ Virtual Extension brings more quality articles to ACM members. 
These articles are now available in the ACM Digital Library. 


Reversing the Landslide 
in Computer-Related Degree 
Programs 


Irma Becerra-Fernandez, Joyce Elam, 
and Susan Clemmons 


Undergraduate and graduate enrollment in computer information system
(CIS)-related coursework has been on a steady
decline. The number of bachelor’s degrees 
in computer science has fallen dramatically 
since 2004; a similar trend is also affecting 
academic programs that combine business 
and IT education. Rumors of CIS faculty 
layoffs compound the grim job outlook of 
CIS graduates, who fear not being able to 
find jobs in a market already plagued by 
challenges brought about by the dot com 
demise and IT outsourcing. How can CIS 
programs survive? This article details some 
successful intervention strategies that can 
be implemented to weather the impending 
crisis, and details how these interventions 
helped one institution reverse the downturn 
and reinvent the image of its CIS programs. 


Practical Intelligence in IT: 
Assessing Soft Skills 
of IT Professionals 


Damien Joseph, Soon Ang, Roger H.L.
Chang, and Sandra A. Slaughter


What qualities make a successful IT professional? This study develops and
tests a measure of soft skills or practical intelligence of IT professionals,
defined as intrapersonal and interpersonal strategies for managing self,
career, and others. The instrument, SoftSkills for IT (SSIT), elicits
individuals’ responses to IT work-related incidents and was administered to
practicing IT professionals and inexperienced IT undergraduates. Results
indicate that practical intelligence is measurable and SSIT discriminates
between experienced and inexperienced IT professionals. This study concludes
by identifying practical implications for selection, training, and development
and proposes future research directions on assessing practical intelligence.


Wireless Insecurity:
Examining User Security Behavior
on Public Networks


Tim Chenoweth, Robert Minch,
and Sharon Tabor


Wireless networks are becoming ubiquitous but often leave users
responsible for their own security. The authors study whether users are
securing their computers when using wireless networks. Automated techniques
are used that scan users’ machines after they associate with a university
wireless network. Results show that over 9% of 3,331 unique computers scanned
were not using a properly configured firewall. In addition, almost 9% had at
least one TCP port open, with almost 6% having open ports with significant
security implications. The authors also discuss cases where connected
computers were compromised by Trojan programs, such as SubSeven and NetBus.


Informatics Creativity:
A Role for Abductive Reasoning?


John Minor Ross


Analysts and programmers may be stymied when faced with novel tasks that
seem beyond the reach of prior education and experience. Sometimes a solution
appears, however, while considering something else seemingly altogether
unrelated. Coming up with new ideas or creative approaches to overcome such
problems may be less daunting once the role of abductive reasoning is
considered.


Designs for Effective
Implementation of Trust
Assurances in Internet Stores


Dongmin Kim and Izak Benbasat


A study of 85 online stores offers a snapshot of how often Internet stores
use trust assurances and what concerns they address. These findings will help
business managers understand how other companies use trust assurances and help
identify what can be improved within their organizations. For example, the
authors determine that about 38% of total assurances were delivered in an
ineffective way, which might cause shopping-cart-abandonment problems. The
article offers design guidelines for trust assurances for Web developers based
on the authors’ analysis and previous studies.


Taking a Flexible Approach to ASPs


Farheen Altaf and David Schuff


There has been a recent revival of the ASP model through the notion of cloud
computing and “software as a service.” The purpose of this article is to
better comprehend the Small to Medium Enterprise (SME) market for ASPs through
an analysis of the factors that are most important to likely adopters. Through
a survey of 101 SMEs, the authors find that cost, financial stability,
reliability, and flexibility are all significantly associated with
self-assessed likelihood of ASP adoption. Surprisingly, flexibility was
negatively associated with likelihood of adoption, possibly indicating a
perception that ASPs are not sought for their flexibility.


Managing a Corporate 
Open Source Software Asset 


Vijay K. Gurbani, Anita Garvert, and 
James D. Herbsleb 


Corporations have used open source software for a long time. But can a
corporation internally develop its software using the open source development
models? It may seem that open source style development (using informal
processes, voluntary assignment to tasks, and having few financial incentives)
may not be a good match for commercial environments. This ongoing work
demonstrates that under the right circumstances, corporations can indeed
benefit from adopting open source development methodologies. This article
presents findings on how corporations can structure software teams to succeed
in developing commercial software using the open-source software development
model.


Takes Two to Tango: How 
Relational Investments Improve 
IT Outsourcing Partnerships 


Nikhil Mehta and Anju Mehta


As the recent economic crisis has shown, a client-vendor partnership
can quickly regress into a contractual
arrangement with a primitive cost-cutting 
objective. Based on interviews with 21 
vendor executives in India, the authors 
recommend that clients with a long-term 
vision for their IT outsourcing function 
may do well by developing mature 
partnerships with their vendors. To address 
the general lack of understanding about 
how to make it happen, the authors make 
provisional recommendations advising 
clients to make relational investments in 
selective areas. Key client benefits of such 
investments include improved service 
quality, cost savings, improved vendor 
sensitivity toward information security and 
privacy, and improved vendor capabilities 
to fulfill clients’ future IT needs.


COMMUNICATIONS OF THE ACM 9 


BLOG@CACM


The Communications Web site, http://cacm.acm.org, 
features more than a dozen bloggers in the BLOG@CACM 
community. In each issue of Communications, we'll publish 
excerpts from selected posts. 




Follow us on Twitter at http://twitter.com/blogCACM 


DOI:10.1145/1646353.1646358 


http://cacm.acm.org/blogs/blog-cacm 


Connecting Women 
and Technology 


Guest blogger Valerie Barr writes about highlights of the ninth Grace 
Hopper Celebration of Women in Computing Conference, including 
keynote speeches by Megan Smith and Francine Berman. 


From “Grace Hopper Conference Opening Session: Part 1”
http://cacm.acm.org/blogs/blog-cacm/43989

The theme of the ninth Grace Hopper Celebration (GHC) of Women in Computing
is “Creating Technology for Social Good.” This is a theme that has clearly
resonated with many people as the conference totally sold out, with 1,608
attendees! There are 178 companies represented, 23 countries, and 728
students. There were more than 100 people who volunteered for the 16
committees that helped organize different aspects of the conference.
One-quarter of the attendees, 430 people, are involved in presentations of
panels, papers, workshops, and Birds of a Feather sessions. In addition to the
usual conference type of activities, one of the sessions on Wednesday was a
resumé review; the volunteer reviewers read more than 300 resumés.

A wonderful element of GHC is the emphasis on networking. At the conference
opening on Thursday, Heidi Kvinge of Intel, the conference chair, challenged
attendees to make the most of this aspect of the conference by introducing
themselves to at least five new people per day. For the undergraduates in
particular, Heidi gave a wonderful example of an elevator speech,
demonstrating how they could capture all the key details about themselves in
just a few sentences.

Heidi also acknowledged the support of SAP, which sponsors videoing at the
conference. She showed the “I Am A Technical Woman” video that was made at GHC
last year, which you can view at http://www.anitaborg.org/news/video. This is
a great way to get a sense of what GHC is like, to understand the incredible
energy at the conference. As per Heidi’s request, please pass this video on to
your friends and colleagues, and to anyone you know who has a daughter.


From “Grace Hopper Keynote 1: Megan Smith”
http://cacm.acm.org/blogs/blog-cacm/44258

Thursday’s keynote address was by Megan Smith, vice president of new
business development and general manager of Google.org. She has been at Google
since 2003 and oversaw the acquisitions that resulted in Google Earth and
Google Maps. In her talk Megan focused on the interconnectedness of CS, using
four examples of areas that demonstrate this.

1. Interconnectedness of people around the world: When you look at Google
query traffic worldwide, you see that there is almost no query traffic from
Africa, though there is increasing SMS activity. For example, M-Pesa is a
service in Kenya that allows full telephony-based money transfer. But the
“real” commercial Internet is coming to Africa. Google is opening five new
offices in Africa, bringing its total to seven. It will be doing maps and
supporting all the “usual” Google apps, working with the Grameen AppLab,
working on health-related applications, building on existing SMS efforts, and
working to get NGO information on the Web.

2. Interconnectedness of data: People at Google have been generating
real-time information about the spread of flu. They have used search logs to
predict flu rates, based on the belief that the first thing people do when
they get sick is start searching the Web. It turns out that they are 89%
accurate on seasonal flu rates, based on verification with U.S. Centers for
Disease Control and Prevention (CDC) data. The benefit of this data-mining
work is that Google can actually give the CDC real-time information more
quickly than the CDC gets it from doctors and hospitals. Next, Google is
working to get these applications into multiple languages; it turns out that
in many parts of the world it is becoming cheaper to collect data digitally
than on paper, so the developing world can begin to move in this direction as
well, using the data mining of digital data to gain information on trends.

3. Civil liberties: Events in Iran and Colombia have demonstrated the use
of technology to mobilize people. The Alliance of Youth Movements Summit,
held last year in New York City and soon to be held again, taught people how
to create youth groups, and heavily utilized Webcasts and Facebook. Megan
discussed the role that technology can play for people in “extreme” situations
such as how SMS alerts can be used in parts of Africa to warn women about
safe travel routes. She argued that technology can help speed up the
improvement of life, particularly for women, in some parts of the world where
there is still great danger. She also discussed the potential for improving
education, such as creating opportunities for collaboration between schools
across geographic and economic divides.

4. The environment: There are many CS opportunities in building the control
systems involved for new energy-delivery approaches. For example, SolarBox is
an application that will help groups of people organize to increase their
buying power of solar panels in their neighborhood. Google’s PowerMeter
application will help people see power usage in their home. Studies show that
once people know how much energy they are using, they usually decrease usage
by 5%-15%.

Megan closed by saying that the 21st century will be all about these kinds of
interconnectedness, and that there are many, many opportunities for people
in CS to work on exciting, interesting, and relevant projects.


From “Grace Hopper Keynote 2: Fran Berman”
http://cacm.acm.org/blogs/blog-cacm/44532

The second keynote speaker was Fran Berman, vice president for research at
Rensselaer Polytechnic Institute. Fran was formerly director of San Diego
Supercomputer Center and has worked for years in the design and development
of a national-scale cyberinfrastructure.

Fran’s talk was entitled “Creating Technology for the Social Good: A
Prologue.” Her basic message was that science, engineering, and technology
really matter when it comes to addressing and solving the most pressing
problems facing society today.

As an example of a problem, and a solution born out of technology, she
briefly discussed the area of safer environments through earthquake
prediction. Basically, computer models are being developed to predict seismic
activity. These models are then run on supercomputers, which generate
output in the form of seismic predictions, showing where seismic activity
will occur and how long it will last after an initial quake. This information
is being used to develop new building codes, better disaster-response plans,
and targeted retrofitting of older construction. Other examples Fran cited
are the OLPC project to bring computers to children in the developing world
and iRobot, which is developing robots suited for dangerous situations so that
humans don’t have to be exposed to danger and risk.

But Fran argues there is a major area that we have to address as the
“prologue” to effectively addressing the large problems. That issue is data.
We have to harness data, so that we can turn it into information and
knowledge. This will help us create a strong foundation for efforts driven by
science and engineering.

Electronic data is fragile. Much of it, such as wikis and Web sites,
disappears quickly or is changed often. And there’s a lot of it! There is
currently more than a zettabyte of data. The U.S. Library of Congress alone
has more than 295 terabytes of data. We are running out of room in which to
store it all, so we have to be cognizant of the data life cycle and look at
ways in which computer scientists can support the data life cycle. But we also
have to recognize that the CS view of data is different than a librarian’s
view of data which, in turn, is different than an individual user’s view of
data.

So the key questions we need to 
think about are: What should we save? 
How should we save it? Who should 
pay for it? 


Addressing these questions now is part of the process of creating a strong
foundation for the technology work we will be doing in the years to come. Fran
pointed out that we have to prepare today’s students with technical skills,
but that they also have to be prepared to understand international cultures,
business, politics, and policy. Only then will they be ready to take on
leadership roles in the years to come. Fran closed by saying that to create
positive change we have to ask the hard questions, particularly about the
representation of women and minorities in CS; create goals and metrics of
success, and then hold people to them; publicly recognize the successes of our
colleagues and students; and, when possible, use our role to create policy,
set priorities, and handle resource allocation.


From “Final Thoughts About Grace Hopper Conference”
http://cacm.acm.org/blogs/blog-cacm/44533

My wrap-up from Grace Hopper— 
some Web sites and information about 
women and technology worldwide, 
much of it gleaned during the session 
“The ‘F’ Word: Feminism and Technol- 
ogy.” The repeated message was that 
we have to see technology as a means to 
an end, not an end itself. If we want to 
build technology to help women, par- 
ticularly in the developing world, we 
have to have the relevant context and 
involve women themselves in the de- 
velopment process. For example, in ru- 
ral Pakistan the majority of women are 
illiterate, so a text-based Internet tool 
is useless. But an audiovisual medium, 
like one that is currently being used to 
provide information about health-care 
services, will be much more successful. 
While in the developed world we seem 
to always think of a computer solu- 
tion, usually Web-based, to problems, 
these days the technology that will help 
women is most likely to involve mobile 
phones. This has been demonstrated 
in Africa by the Advancement through 
Interactive Radio project in which mo- 
bile phone technology allows women 
to participate in call-in programs on 
TV and radio, giving them a voice in 
community affairs which they had not 
previously had. 


Valerie Barr is the chair of the computer science 
department at Union College. 
Where the Data Is


The vast Internet delivers only a sliver of the information the average American
consumes each day, according to a recent report by the University of California,
San Diego (http://hmi.ucsd.edu/howmuchinfo_research_report_consum.php).
Less than 1% of the daily data diet comes from Web browsing, way behind the
top three providers (video games, television, and movies), say Roger E. Bohn and
James E. Short in “How Much Information? 2009 Report on American Consumers.”
That improbable finding is due to a statistical sleight of hand that measures
information consumption in bytes, giving streaming video supersize status. The
numbers are less extreme though still slanted when Internet information is
measured in hours (15.6% of total versus 75.1% for moving-picture media) or
words (24.7% versus 47.6%). They are skewed further because they only measure
information consumed at home, not work.

The report is full of extravagant, oddball stats: Americans consumed 1.3 tril- 
lion hours of information in 2008; those 3.6 zettabytes are 20 times more than 
could be stored at one time on all the world’s hard drives. Some 70% of adults play 
computer games; 10 million subscribers watched 3.6 hours of video on average 
per month on mobile phones; and gaming PCs occupied less than 2% of data con- 
sumption hours but almost 40% of total bytes. And on it goes; data devoid of con- 
text. What else to make of its comparison of Lincoln’s Gettysburg Address (1,293 
bytes) with an episode of NBC’s “Heroes” (10GB)? 
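
The byte figures quoted in this piece can be cross-checked against one another with a little arithmetic. The sketch below is only an illustration and is not taken from the report: it assumes decimal units (1GB = 10^9 bytes, 1 zettabyte = 10^21 bytes) and a 365-day year, and every variable name is invented for the example. It divides the 3.6-zettabyte annual total by the 34GB daily per-person figure to see how many consumers the two numbers jointly imply, and it sizes up the Gettysburg Address against an episode of “Heroes.”

```python
# Cross-check of figures quoted in the article.
# Decimal units and a 365-day year are assumptions made for this sketch.
GB = 10**9           # assumed: 1 gigabyte = 10^9 bytes
ZETTABYTE = 10**21   # assumed: 1 zettabyte = 10^21 bytes
DAYS_PER_YEAR = 365  # assumed: non-leap year

annual_total_bytes = 3.6 * ZETTABYTE   # "3.6 zettabytes" consumed in 2008
daily_per_person_bytes = 34 * GB       # "the 34GB you take in each day"
annual_total_hours = 1.3e12            # "1.3 trillion hours"

# Number of consumers implied if both byte figures describe the same population.
implied_consumers = annual_total_bytes / (daily_per_person_bytes * DAYS_PER_YEAR)

# Hours per person per day implied by the hours total and that population.
implied_hours_per_day = annual_total_hours / (implied_consumers * DAYS_PER_YEAR)

gettysburg_bytes = 1_293               # Gettysburg Address, per the report
heroes_bytes = 10 * GB                 # one episode of "Heroes", per the report
heroes_to_gettysburg = heroes_bytes // gettysburg_bytes

if __name__ == "__main__":
    print(f"Implied consumers: {implied_consumers / 1e6:.0f} million")
    print(f"Implied hours per person per day: {implied_hours_per_day:.1f}")
    print(f"One episode equals about {heroes_to_gettysburg:,} Gettysburg Addresses")
```

Under these assumptions the two byte figures imply a population of roughly 290 million, so the annual and daily numbers are at least consistent with each other, whatever one makes of measuring information in bytes.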

The report sidesteps assigning “value” to information and allows that the
Internet “provides a substantial portion of some kinds of information” but
misses a key distinction: movies and computer games aren’t consumed as a
source of data. They are junk food, M&Ms to the Web’s square meals. Yes, the
Web also serves empty calories, and TV can serve healthy fare. But people do
not use video-based media to finish work, communicate with friends, and help
make decisions. The Internet plays all these roles. It is the primary source
of news for 40% of adults
(http://people-press.org/report/479/internet-overtakes-newspapers-as-news-source).
Some 80% use it to socialize
(http://www.ruderfinn.com/about/news/rf-s-new-study-of.html). It is a source
of information on medical conditions and treatments for 61% of U.S. adults
(http://www.pewinternet.org/Reports/2009/8-The-Social-Life-of-Health-Information.aspx).
And 53% find help for financial decisions
(http://www.ebri.org/pdf/surveys/mrces/2007_factsheet_1.pdf) online.

Serious stuff, but hard to see amidst the 34GB you take in each day. 




cacm online 


ACM Member News


FIRST-EVER CS EDUCATION WEEK A SUCCESS
ACM and a handful of partners launched Computer Science Education Week in
the U.S., which received a great deal of media attention and was celebrated at
schools and businesses throughout the nation. Held from Dec. 5-11, the event
had as its central hub http://www.csedweek.org, which featured curriculum
guides, posters, and research, and also provided a platform for users to share
ideas through interactive links at Facebook, YouTube, and Twitter.

Computer Science Education Week received favorable media attention in The New
York Times, The Washington Post, and elsewhere. And in a three-minute video
posted on YouTube (http://www.youtube.com/CSEdWeek), ACM CEO John White
discussed the importance of computer science, noting that “Computing is
fueling countless advances, from improving communications and advancing health
care to protecting national security and improving energy efficiency to
helping understand the depths of the universe.”


SIGCSE 2010

The theme of SIGCSE 2010 is “Making Contact,” and the 41st ACM Technical
Symposium on Computer Science Education will continue SIGCSE’s tradition of
bringing together colleagues from around the world to make contact with fellow
educators via panels, poster and special sessions, workshops, Birds of a
Feather sessions, and informal settings.

The keynote speakers are Sally Fincher of the University of Kent, who will
speak about useful sharing; Carl E. Wieman of the University of British
Columbia, who will discuss science education for the 21st century; and Michael
Wrinn of Intel, who will talk about parallel computing and industry and
academic collaboration. SIGCSE 2010 will be held from March 10-13 in
Milwaukee, WI. For more information, visit http://sigcse.org/sigcse2010.
Alternate Interface Technologies Emerge


Researchers working in human-computer interaction are 
developing new interfaces to produce greater efficiencies in personal 
computing and enhance miniaturization in mobile devices. 


Hardware engineers continue to pack more processing power into smaller
designs, opening up an array of possibilities that researchers say will lead
to human-computer interfaces that are more natural and efficient than the
traditional hallmarks of personal computing. These smaller designs have given
rise to new mobile platforms, where the barrier to further miniaturization no
longer is the hardware itself but rather humans’ ability to interact with it.
Researchers working in human-computer interaction (HCI) are dedicating effort
in both areas, developing interfaces that they say will unlock greater
efficiencies and designing new input mechanisms to eliminate some of the
ergonomic barriers to further miniaturization in mobile technology.

Patrick Baudisch, a computer science professor at Hasso Plattner Institute in
Potsdam, Germany, points out that there are two general approaches to HCI, a
field that draws on computer science, engineering, psychology, physics, and
several design disciplines. One approach focuses on creating powerful but not
always totally reliable interfaces, such as speech or gesture input. The other
focuses on creating less complex, more reliable [...] approach, Baudisch
argues that interfaces developed with simplicity and reliability in mind
facilitate an uninterrupted engagement with the task at hand, increasing the
opportunity for users to experience what psychologist Mihaly Csikszentmihalyi
calls “optimal experience” or “flow.”

NanoTouch, a back-of-device input technology for very small screens on mobile devices and
electronic jewelry. The technology demonstrated here by Patrick Baudisch was developed at
Microsoft Research and Hasso Plattner Institute.

Baudisch began his HCI career in the Large Display User Experience
group at Microsoft Research, where he focused on how users could interact
more effectively with wall displays and other large-format technologies that
render traditional input techniques nearly useless. In his current work at
the Hasso Plattner Institute, Baudisch
focuses on projects designed to facili- 
tate the transition from desktop to mo- 
bile computing. “There is a single true 
computation platform for the masses 
today,” he says. “It is not the PC and not 
One Laptop Per Child. It is the mobile 
phone—by orders of magnitude. This 
is the exciting and promising reality we 
need to design for.” 

One example of an interface technology that Baudisch designed to facilitate
this transition to mobile computing is NanoTouch. While current mobile devices
offer advanced capabilities, such as touch input, they must be large enough to
manipulate with fingers. The NanoTouch interface, which is designed to
sidestep this physical constraint, makes the mobile device appear translucent
and moves the touch input to the device’s back side so that the user’s fingers
do not block the front display. Baudisch says NanoTouch eliminates the
requirement to build interface controls large enough for big fingertips and
makes it possible to interact with devices much smaller than today’s handheld
computers and smartphones.

In his most recent project, called RidgePad, Baudisch is working on a way to
improve the recognition accuracy of touch screens. By monitoring not only the
contact area between finger and screen, but also the user’s fingerprint within
that contact area, RidgePad reconstructs the exact angle at which the finger
touches the display. This additional information allows for more specific
touch calibration, and, according to Baudisch, can effectively double the
accuracy of today’s touch technology.

Interpolating force sensitive resistance (IFSR), a multitouch input technology developed at
New York University’s Media Research Lab. In this demo, Ilya Rosenberg demonstrates a
24-inch Touchco IFSR sensor that serves as an interactive desktop surface.

A prototype shape-shifting ATM display that can assume multiple graphical and tactile
states. The display was developed at Carnegie Mellon University’s Human-Computer
Interaction Institute.


Increasing Human Performance 
Another HCI researcher focusing on 
new interface technologies for mo- 
bility is Carnegie Mellon University’s 
Chris Harrison, who points out that 
while computers have become orders 
of magnitude more powerful than they 
were a few decades ago, users continue 
to rely on the mouse and keyboard, 
technologies that are approximately 45 
and 150 years old, respectively. “That’s 
analogous to driving your car with 
ropes and sails,” he says. “It’s this huge 
disparity that gets me excited about in- 
put.” Harrison, a graduate student in 
CMU’s Human-Computer Interaction 
Institute, says that because computers 
have grown so powerful, humans are 
now the bottleneck in most operations. 
So the question for Harrison is how to 
leverage the excess computing power 
to increase human performance. 

For one of Harrison’s projects, in- 
creasing human performance benefit- 
ted from his observation that mobile 
devices frequently rest on large sur- 
faces: Why not use the large surfaces 
for input? This line of thinking was 
the birthplace for Harrison’s Scratch 
Input technology. The idea behind 
Scratch Input is that instead of pick- 
ing up your media player to change 
songs or adjust volume, the media 
player stays where it is but monitors 
acoustic information with a tiny, built- 
in microphone that listens to the table 
or desk surface. To change the volume 
or skip to the next track, for example, 
you simply run your fingernail over the 
surface of the table or desk using dif- 
ferent, recognizable scratch gestures. 
The media player captures the acoustic
information propagating through
the table’s surface and executes the
appropriate command.

In addition to developing Scratch
Input, which Harrison says is now mature
enough to be incorporated into
commercial products, he and his colleagues
have been working on multitouch
displays that can physically
deform to simulate buttons, sliders,
arrows, and keypads. “Regular touch
screens are great in that they can render
a multitude of interfaces, but they
require us to look at them,” says Har- 
rison. “You cannot touch type on your 
iPhone.” The idea with this interface 
technology, which Harrison calls a 
shape-shifting display, is to offer some 
of the flexibility of touch screens while 
retaining some of the beneficial tactile 
properties of physical interfaces. 

Another interface strategy designed
to offer new advantages while retaining
some of the benefits of older technology
is interpolating force-sensitive
resistance (IFSR). Developed by two
researchers at New York University, IFSR
sensors are based on a method called
force-sensitive resistance, which has 
been used for three decades to create 
force-sensing buttons for many kinds 
of devices. However, until Ilya Rosenberg
and Ken Perlin collaborated on
the IFSR project, it was both difficult
and expensive to capture the accurate
position of multiple touches on a surface
using traditional FSR technology
alone. “What we created to address this
limitation took its inspiration from human
skin, where the areas of sensitivity
of touch receptors overlap, thereby
allowing for an accurate triangulation of
the position of a touch,” says Perlin, a
professor of computer science at NYU’s
Media Research Lab.

In IFSR sensors, each sensor element
detects pressure in an area that
overlaps with its neighboring elements.
By sampling the values from the
touch array, and comparing the output
of neighboring elements in software,
Rosenberg and Perlin found they could
track touch points with an accuracy
approaching 150 dots per inch, more
than 25 times greater than the density
of the array itself. “In designing a new
kind of multitouch sensor, we realized
from the outset how much more
powerful a signal is when properly
sampled,” says Rosenberg, a graduate
student in NYU’s Media Research Lab.
“So we aimed to build an input device
that would be inherently anti-aliasing,
down to the level of the hardware.”

Recognizing the increased interest
in flexible displays, electronic paper,
and other technologies naturally suited
to their core technology, Rosenberg
and Perlin spun off their sensor
technology into a startup called
Touchco, and now are working with
other companies to integrate IFSR
into large touch screens and flexible
electronic displays. In addition, the 
team is looking into uses as diverse 
as musical instruments, sports shoes, 
self-monitoring building structures, 
and hospital beds. 
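
The interpolation idea behind IFSR, in which neighboring sensor elements with overlapping sensitivity are compared in software to locate a touch more finely than the element spacing, can be illustrated with a simple force-weighted centroid. This is only a sketch of the general principle and not Touchco's actual algorithm: the function name `interpolate_touch`, the 6 mm element pitch, and the sample readings are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def interpolate_touch(readings: Sequence[Sequence[float]],
                      pitch_mm: float = 6.0) -> Tuple[float, float]:
    """Estimate a single touch position as the force-weighted centroid.

    readings[row][col] is the force reported by one sensor element;
    pitch_mm is the (assumed) spacing between neighboring elements.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for row, values in enumerate(readings):
        for col, force in enumerate(values):
            total += force
            sum_x += force * col * pitch_mm
            sum_y += force * row * pitch_mm
    if total == 0:
        raise ValueError("no touch detected")
    return sum_x / total, sum_y / total


# A touch that presses mostly on the center element and slightly on its
# right-hand neighbor lands between the two element centers.
grid = [
    [0.0, 0.0, 0.0],
    [0.0, 3.0, 1.0],
    [0.0, 0.0, 0.0],
]
x_mm, y_mm = interpolate_touch(grid)
```

Because the result is a weighted average rather than the index of the strongest element, the estimate can fall anywhere between element centers, which is how a coarse array can report positions at a finer resolution than its own density.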

“It seems that many of the hurdles
are largely ones of cultural and economic
inertia,” says Perlin. “When
a fundamentally improved way of
doing things appears, there can be
significant time before its impact is
fully felt.”

As for the future of these and other 
novel input technologies, users them- 
selves no doubt will have the final 
word in determining their utility. Still, 
researchers say that as input technolo- 
gies evolve, the recognizable mecha- 
nisms for interfacing with computers 
will likely vanish altogether and be 
incorporated directly into our environ- 
ment and perhaps even into our own 
bodies. “Just as we don’t think of, say,
the result of LASIK surgery as an interface,
the ultimate descendants of
computer interfaces will be completely
invisible,” predicts Perlin. “They will
be incorporated in our eyes as built- 
in displays, implanted in our ears as 
speakers that properly reconstruct 3D 
spatial sound, and in our fingertips as 
touch- or haptic-sensing enhancers 
and simulators.” 

On the way toward such seamlessly
integrated technology, it’s likely that
new interface paradigms will continue
to proliferate, allowing for computer
interactions far more sophisticated
than the traditional mouse and
keyboard. CMU’s Harrison predicts
that eventually humans will be able
to walk up to a computer, wave their
hands, speak to it, stare at it, frown,
laugh, and poke its buttons, all as a
way to communicate with the device.
In Harrison’s vision of this multimodal
interfacing, computers will be able
to recognize nuanced human communication,
including voice tone, inflection,
and volume, and will be able to
interpret a complex range of gestures,
eye movement, touch, and other cues.


“If we ever hope for human-computer
interaction to achieve the fluidity
and expressiveness of human communication,
we need to be equally diverse
in how we approach interface design,”
he says. Of course, not all tools and
technologies will require a sophisticated
multimodal interface to be perfectly
functional. “To advance to the next
song on your portable music player, a
simple button can be fantastically efficient,”
says Harrison. “We have to be
diligent in preserving what works, and
investigate what doesn’t.”


Type Theory
Comes of Age


Type systems are moving beyond the realm of data structure 
and into more complex domains like security and networking. 


WHEN THE PHILOSOPHER
Bertrand Russell invented
type theory at
the beginning of the
20th century, he could
hardly have imagined that his solution
to a simple logic paradox (defining
the set of all sets that are not members
of themselves) would one day shape
the trajectory of
21st century computer science.

Once the province of mathematicians
and social scientists, type theory
has gained momentum in recent years
as a powerful tool for ensuring data
consistency and error-free program
execution in modern commercial
programming languages like C#, Java,
Ruby, Haskell, and others. And thanks 
to recent innovations in the field, type 
systems are now moving beyond the 
realm of data structure and into more 
complex domains like security and net- 
working. 

First, a quick primer. In programming
languages, a type constitutes a
definition of a set of values (for example,
“all integers”), and the allowable
operations on those values (for example,
addition and multiplication). A type
system ensures the correct behavior of
any program routine by enforcing a set
of predetermined behaviors. For example,
in a multiplication routine, a type
system might guarantee that a program
will only accept arguments in the form
of numerical values. When other values
appear, such as a date or a text string,
the system will return an error. For
programmers, type systems help pre- 
vent undetected execution errors. For 
language implementers, they optimize
execution and storage efficiency. For example,
in Java integers are represented
in the form of 32 bits, while doubles
are represented as 64 bits. So, when a
Java routine multiplies two numbers,
the type system guarantees they are either
integers or doubles. Without that
guarantee, the runtime would need to
conduct an expensive check to determine
what kinds of numbers were being
multiplied before it could complete
the routine. 
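
The multiplication example can be made concrete with a small sketch. Python is dynamically typed, so the check below runs when the routine is called rather than at compile time; the function name `multiply` and the decision to reject booleans are assumptions made for illustration, not part of any particular type system.

```python
from numbers import Number


def multiply(a, b):
    """Multiply two values, accepting only numerical arguments."""
    for arg in (a, b):
        # bool is technically a numeric type in Python, so it is
        # rejected explicitly here to keep the example strict.
        if isinstance(arg, bool) or not isinstance(arg, Number):
            raise TypeError(f"expected a number, got {type(arg).__name__}")
    return a * b


product = multiply(6, 7)

# A text string is refused instead of being silently repeated,
# which is what "3" * 2 would otherwise do in Python.
try:
    multiply("3", 2)
    rejected = False
except TypeError:
    rejected = True
```

This explicit check is exactly the kind of runtime work that a static guarantee makes unnecessary: if the compiler has already established that both arguments are numbers, the routine can multiply them directly.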

What distinguishes a type system
from more conventional program-level
verification? First, a type system must
be “decidable”; that is, the checking
should happen mechanically at the earliest
opportunity (although this does not
have to happen at compilation time; it
can also be deferred to runtime). A type
system should also be transparent; that
is to say, a programmer should be able
to tell whether a program is valid or not
regardless of the particular checking
algorithm being used. Finally, a “sound”
type system prevents a program from
performing any operation outside its 
semantics, like manipulating arbitrary 
memory locations. 

Languages without a sound type
system are sometimes called unsafe or
weakly typed languages. Perhaps the
best-known example of a weakly typed
language is C. While C does provide types,
its type checking system has been inten- 
tionally compromised to provide direct 
access to low-level machine operations 
using arbitrary pointer arithmetic, cast- 
ing, and explicit allocation and deallo- 
cation. However, these maneuvers are 
fraught with risk, sometimes resulting 
in programs riddled with bugs like buf- 
fer overflows and dangling pointers that 
can cause security vulnerabilities. 

By contrast, languages like Java,
C#, Ruby, JavaScript, Python, ML, and
Haskell are strongly typed (or “type
safe”). Their sound type systems catch
any type system violations as early as
possible, freeing the programmer to
focus debugging efforts solely on valid
program operations.


Static and Dynamic Systems 


Broadly speaking, type systems come in
two flavors: static and dynamic. Statically
typed languages catch almost all
errors at compile time, while dynamically
typed languages check most errors
at runtime. The past 20 years have
seen the dominance of statically typed
languages like Java, C#, Scala, ML, and
Haskell. In recent years, however, dynamically
typed languages like Scheme,
Smalltalk, Ruby, JavaScript, Lua, Perl,
and Python have gained in popularity 
for their ease of extending programs at 
runtime by adding new code, new data, 
or even manipulating the type system at 
runtime. 
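
The practical difference shows up in when a mistake is discovered. In the following sketch, written in Python as one of the dynamically typed languages mentioned above, the faulty line causes no trouble until the branch containing it actually runs. The function `describe` and its arguments are hypothetical.

```python
def describe(n, verbose):
    """Return a short description of n."""
    if verbose:
        # Concatenating a string with an integer is a type error, but a
        # dynamically typed language only notices when this line executes.
        return "value: " + n
    return "ok"


# The call succeeds because the faulty branch is never taken.
quiet = describe(3, False)

# The same call with verbose=True fails, but only at runtime.
try:
    describe(3, True)
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True
```

A statically typed language would typically reject the equivalent program before it ran at all, regardless of which branch a given call happens to take.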

Statically typed languages have re- 
strictions and annotations that make 
it possible to check most type errors at 
compile time. The information used 
by the type checker can also be used by 
tools that help with program text-edit- 
ing and refactoring, which is a consid- 
erable advantage for large modular pro- 
grams. Moreover, static type systems 
enable change. For example, when an
important data structure definition is
changed in a larger program, the type
system will automatically point to all
locations in the program that also need
to change. In a dynamically typed language
it would be extremely difficult to
make such changes in larger programs,
as it would not be known what other
parts are affected by the change. On
the other hand, some correct programs
may be rejected by a static type system
when it is not powerful enough to prove
that they are safe.

In an effort to make static type systems
more flexible, researchers have
developed a number of extensions like
interface polymorphism, a popular approach
introduced by object-oriented
languages like Simula, C++, Eiffel, Java,
and C#. This method allows for inclusion
between types, where types are seen as 
collections of values. So, an element of 
a subtype—say, a square—can be con- 
sidered as an element of its supertype— 
say, a polygon—thus allowing the ele- 
ments of different but related types to 
be used flexibly in different contexts. 
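
The square-and-polygon relationship described above can be sketched as follows. The class names, the vertex representation, and the helper `total_perimeter` are assumptions chosen for illustration; the point is only that code written against the supertype accepts elements of the subtype unchanged.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


class Polygon:
    """Supertype: any closed shape described by its vertices."""

    def __init__(self, vertices: Sequence[Point]) -> None:
        if len(vertices) < 3:
            raise ValueError("a polygon needs at least three vertices")
        self.vertices = list(vertices)

    def perimeter(self) -> float:
        pts = self.vertices
        return sum(math.dist(pts[i], pts[(i + 1) % len(pts)])
                   for i in range(len(pts)))


class Square(Polygon):
    """Subtype: every square is also a polygon."""

    def __init__(self, side: float) -> None:
        super().__init__([(0.0, 0.0), (side, 0.0), (side, side), (0.0, side)])
        self.side = side


def total_perimeter(shapes: List[Polygon]) -> float:
    # Written against Polygon only, yet it accepts Square values too.
    return sum(shape.perimeter() for shape in shapes)


triangle = Polygon([(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)])
total = total_perimeter([triangle, Square(2.0)])
```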

Another form of polymorphism,
found in almost all programming languages,
is ad hoc polymorphism (also
called overloading), where code behaves
in different ways depending on
the type. This approach has found its
fullest expression in Haskell, thanks in
part to the efforts of Philip Wadler, pro- 
fessor of theoretical computer science 
at the University of Edinburgh. “When 
we designed Haskell, it quickly became 
clear that overloading was important 
and that there was no good solution,” 
says Wadler. “We needed overloading 
for equality, comparison, arithmetic, 
display, and input.” 
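
Overloading for display, one of the cases Wadler lists, can be approximated in Python with the standard library's `functools.singledispatch`, which selects an implementation based on the type of the first argument. This is a rough analogue and not how Haskell type classes work internally; the function name `display` and the formatting rules are assumptions for illustration.

```python
from functools import singledispatch


@singledispatch
def display(value) -> str:
    # Fallback used when no implementation is registered for the type.
    raise TypeError(f"no display rule for {type(value).__name__}")


@display.register
def _(value: int) -> str:
    return str(value)


@display.register
def _(value: str) -> str:
    return f'"{value}"'


@display.register
def _(value: list) -> str:
    # The list rule reuses whichever rule fits each element.
    return "[" + ", ".join(display(item) for item in value) + "]"


rendered = display([1, "a", [2]])
```

One name, `display`, now behaves differently for integers, strings, and lists, and a new type can be supported later by registering another implementation without touching the existing ones.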

The Haskell system has evolved considerably
over the years, thanks to the
efforts of a far-flung group of
contributors. “Once we’d come up with
the initial idea of type classes, it led to
a vast body of work, all sorts of clever
researchers coming up with neat extensions
to the system, or applying it to do
things that we’d never thought it could
do,” says Wadler. Today, Haskell ranks 
as the programming world’s premier 
case study in ad hoc polymorphism. 

The dream of unifying static and dynamic
type systems has long fascinated
researchers. Today, several computer
scientists are probing the possibility of
merging these approaches. Wadler is
pursuing a promising line of research
called blame calculus that attempts to
incorporate both static and dynamic
typing, while Erik Meijer, a language architect
at Microsoft Research, proposes
to use “static typing when possible,
dynamic typing when necessary.”


Security Type Systems

In recent years, researchers have also
been exploring type systems capable
of capturing a greater range of programming
errors such as the public
exposure of private data. These emerging
type systems are known as security
type systems. Whereas a traditional
type system enforces rules by assigning
types to values, a security type
system could apply the same principle
of semantic checking to determine the
owner of a particular piece of information.
Those annotations could then
help ensure the integrity of data flowing
through the system. Two promising
security research projects are
the AURA programming language,
developed by Steve Zdancewic, associate
professor of computer science
at the University of Pennsylvania, and Jif,
a Java-based security-typed language
developed by Andrew Myers, associate
professor of computer science at Cornell
University.

Another interesting application of
type checking involves hybridizing type
systems and theorem provers. “Historically,
there have been two parallel
tracks in the software engineering
world: type systems and theorem provers.
The type systems track has always
emphasized lightweight methods,”
says Benjamin C. Pierce, professor of
computer science at the University of
Pennsylvania, “but the formal methods
people aren’t interested in that. Today,
they’re starting to meet in the middle.”
Pierce points to refinement types,
which are types qualified by a logical
constraint; an example is the type
of even numbers, that is, the type of
integers qualified by the is-an-even-number
constraint. While the theory
for refinement types has existed for
a long time, only recent progress in
automatic theorem proving makes
refinement types suddenly practical. A
promising security project was recently
carried out by Andrew D. Gordon, principal
researcher at Microsoft Research
Cambridge, and colleagues. They added
a system of refinement types to the
F# programming language and were
able to verify security properties of F#
implementations of cryptographic pro-
tocols by type checking. 
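
The even-number example of a refinement type can be imitated in a few lines. A true refinement type checker discharges the constraint statically, typically with the help of an automatic theorem prover; the sketch below instead checks the constraint at runtime when a value is constructed, which is a deliberate simplification. The class `Even` and the function `half` are hypothetical names.

```python
class Even(int):
    """An integer qualified by the is-an-even-number constraint."""

    def __new__(cls, value):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("Even requires an integer")
        if value % 2 != 0:
            raise ValueError(f"{value} is not even")
        return super().__new__(cls, value)


def half(n: Even) -> int:
    # Safe to halve without a remainder: the constraint was
    # established when the Even value was created.
    return n // 2


result = half(Even(10))

# An odd number never becomes an Even value in the first place.
try:
    Even(7)
    constraint_enforced = False
except ValueError:
    constraint_enforced = True
```

The design point carries over to the static setting: once a value has the refined type, every function that receives it may rely on the constraint without rechecking it.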

While type theory has matured con- 
siderably over the past 100 years, it still 
remains an active research arena for 
computer scientists. As type systems 
move beyond the realm of data consis- 
tency and into headier computational 
territories, the underlying principles 
of type theory are beginning to shape 
the way researchers think about pro- 
gram abstractions at a deep—even 
philosophical—level. Bertrand Russell 
would be proud.


Improving Disaster
Management


Social networking, sophisticated imaging, and dual-use technologies promise 
improved disaster management, but they must be adopted by governments 
and aid agencies if more lives are to be saved in the wake of crises. 


WHEN THE SEPTEMBER 11,
2001 terrorist attacks
ripped through the
heart of New York City,
the July 7, 2005 suicide
bombings created chaos and mayhem
in central London, and the Boxing Day
2004 Indian Ocean earthquake caused
a tsunami that swept away more than
200,000 lives, information and communication
technologies played a part in
disaster response. Communication was
key, but not always possible as infra- 
structure collapsed and mobile phone 
networks became overloaded, prompt- 
ing renewed efforts to develop effective 
disaster management strategies. 

Research organizations, relief agencies,
and technology providers agree
that technology can save lives in a disaster,
but here consensus ends, with
a rift between researchers pursuing the 
possibilities of Web 2.0 applications 
and field workers largely committed to 
their traditional toolkit of mobile and 
satellite phones. 

“IT systems make it possible to han- 
dle large amounts of data to assess the 
situation after a disaster has struck, 
but we need to move in the direction of 
community response, developing me- 
dia centers that accommodate citizen 
journalists,” says Kathleen Tierney, 
professor of sociology and director of 
the Natural Hazards Research and Ap- 
plications Information Center at the 
University of Colorado, Boulder. “Of- 
ficial agencies need to interact with 
citizen first responders and assess in- 
formation being gathered in the field, 
rather than depending on information 
that is filtered through hierarchical or- 
ganizations. The information may not 
be 100% accurate, but mobile phone 
pictures taken at the scene provide the 
most rapid information.” 

Tierney is a proponent of Web 2.0 applications,
such as Twitter, blogs, and
wikis, as a means of improving disaster
response. “People’s use of technology
in crises is expanding rapidly, ahead
of the use of technology by emergency
management agencies,” she says. “We
need people in these agencies who are
disaster and technology savvy. We also
need opinion leaders and foundations
that fund disaster assistance to think 
along new lines.” 

The concept of community response
plays into the thesis of Ramesh Rao,
director of the California Institute for
Telecommunications and Information
Technology (Calit2) at the University of
California, San Diego. Rao believes research
into technological, sociological,
and organizational issues is critical to
the improvement of disaster response.
“There is a great opportunity to use
technology to improve disaster management,
but at the moment people
are in the way of that improvement,”
he says. “There is a lack of information
sharing, which is not a technology
problem, but a problem stemming
from organizations wanting things this
way.”

One advance Rao suggests is dual
use of technology. During peaceful
times, dual-use technology, such as a
mobile phone, operates as an everyday
personal communications device, but
during an emergency it transforms into
an information sensor and disseminator.
This overcomes aversion to using
different communications equipment
during a crisis and eliminates the time
lag caused by government agencies collecting,
processing, and distributing
crisis-related data. Direct, firsthand
reports from a disaster can provide a
realistic picture, helping to avoid the 
confusion that can result from wide- 
spread, and not always accurate, televi- 
sion broadcasts. 

Proving the value of dual-use technology,
Calit2 has developed a peer-to-peer
incident notification system that builds 
on the concept of human sensors. The 
human sensors collect and relay infor- 
mation about events, such as wildfires 
and traffic accidents, to first respond- 
ers and the general public using mobile 
phones. 

The notification system is available
across all of California’s major cities
and is based on speech recognition, allowing
commuters to call in and report
incidents, or call in and listen to reports
of events that could disrupt their travel.
Content is self-regulated with users
flagging incidents that are irrelevant
or abusive, and the notification system
includes algorithms that rate users who
report incidents. Conversely, the system
can notify all users of an incident via a 
voice call or text message. 

“If you see an accident and call 911, 
the police come, the local radio station 
picks up what has happened and trans- 
mits the problem, but that’s often too 
late to allow commuters to get off the 
highway,” Ganz Chockalingam, princi- 
pal development engineer at Calit2 and 
developer of the notification system, ex- 
plains. “The incident notification system 
is successful because there is no middle 
man and no time delay. Typically, we get 
about 1,000 calls a day, but during the 
California wildfires [in 2009] that went 
up to about 10,000 calls a day.” 

Unlike traditional disaster manage- 
ment systems that are inflexible and 
constrained by capacity, this peer-to- 
peer system can scale to deliver real- 
time information during a disaster as 
there is no single channel of informa- 
tion and no single point of information 
control. The project is currently fund- 
ed by Calit2, although Chockalingam 
points out that costs are relatively low as 
much depends on user technology.


New Thinking, New Tools 

Stepping out of research labs and into
the commercial world, disaster management
development follows the
path of improved global communication
and social networking. ImageCat,
a risk-management innovation
company based in Long Beach, CA,
works with government, industry, and
research organizations to develop new
thinking, tools, and services that are
available to both government agencies
and businesses, such as insurance
companies that need to estimate losses
after a disaster.

ImageCat also concentrates on
remote sensing and geographic information
systems. One recent development
is a virtual disaster viewer, a
Web-based system that uses remote
sensors to gather information that
can be displayed and used to assess
the aftermath of a disaster. The viewer
is a social networking-type tool to
which researchers and the public can
add information. It was tested during
the 2008 earthquake in the Sichuan 
province of China, with about 100 en- 
gineers accessing before and after sat- 
ellite imagery to monitor the extent of 
damage in the region. 

“The viewer allows users to conduct
a virtual disaster survey without leaving
their desks,” says ImageCat CEO
Ronald Eguchi. “In the future, people
at the scene who take pictures with
mobile phones will be able to upload
them to the viewer. We are in discussion
with the United Nations about
how it could use the viewer in disaster
management.”

With a myriad of sensors around 
the world and optical and radar satel- 
lite images of an adequate resolution 
to see people on the ground, the possi- 
bilities of gathering and sharing data, 
such as earthquake, coastal, and hur- 
ricane images, are almost boundless 
and could support significant humani- 
tarian relief efforts. 

The benefits of satellite images in 
disaster response are not limited to 
countries that own satellites, however, 
and disaster-struck countries can acti- 
vate a clause of the Charter of the Unit- 
ed Nations that requires imagery to be 
made available to a nation in distress. 

Eguchi believes much more can be 
done with technology to save lives in 
a disaster, but also recognizes the re- 
alities of aid agencies. “Technologies 
such as the virtual disaster viewer are 
new to aid agencies. Agencies need 
technology that is tested and validat- 
ed, that adds value to what they do and 
demonstrates efficiency in getting the 
right information to the right people at 
the right time,” he says. 

UNICEF workers say the best in-field
technologies for disaster management
are simple to use, low maintenance,
lightweight, and cheap. Reflecting
Rao’s promotion of dual-use technologies
that require minimal training,
UNICEF is pioneering RapidSMS and
a field-based communications system
called Bee.

[Sidebar: The designation “ACM Fellow”
may be conferred upon those
ACM members who have
distinguished themselves by
outstanding technical and
professional achievements in
information technology, who are
current professional members of
ACM and have been professional
members for the preceding five
years. Any professional member
of ACM may nominate another]

RapidSMS provides real-time trans- 
mission of data in a breaking emer- 
gency or in a long-standing disaster, al- 
lowing aid workers to monitor supplies 
and report on situations that require 
immediate response. At the moment, 
it is being used in a number of African 
countries to monitor nutrition, water, 
sanitation, and supply chains. 

UNICEF’s Bee is an open source
emergency telecommunications system
that provides Internet access in areas
where infrastructure is nonexistent
or unusable. It provides a telephony
service and Wi-Fi access to applications
such as ones that monitor health
and track supplies. The Bee system requires
no tools, and can be installed by
a field worker and be operational within
30 minutes. Working with RapidSMS,
it helps UNICEF provision supplies
appropriately and gives workers immediate
warnings of potential health
risks and disease outbreaks. When an
area stabilizes, the Bee system is left in
place, acting as a base for a new
communications infrastructure.

International humanitarian aid 
agency Mercy Corps also focuses on 
communication. “The driving force 
behind all the equipment we carry is 
communications and we use the cheap- 
est services we can find,” says Richard 
Jacquot, a member of Mercy Corps’ 
global emergency operations team. 
“We all carry a BlackBerry, a cheap lo- 
cal mobile, a satellite phone such as the 
Inmarsat BGAN, sometimes a VHF ra- 
dio set and a laptop equipped with ap- 
plications, including Microsoft Office, 
email, and a Web browser. Where there 
is no network, or when a hostile govern- 
ment blocks access to local networks, 
we depend on satellite technology.” 

Far from the world of Web 2.0 collaboration
tools in the war-torn regions
of Africa and Asia, Jacquot’s technology
request is simple. “We have to carry too
many tools, then we have to get authorization
for them depending on where
we are. Integrating everything into one
device is too much to ask for, but some
integration would be great,” he says.

While those in the labs developing
technology applications for disaster
management and those in the field
seeking simple, usable, and affordable
solutions share the belief that technology
can save more lives, it is important
to not ignore the self-imposed threat
created by technology development
and nurtured by those who see it as a
weapon with which to kill rather than a
shield with which to protect.

[Figure: In Malawi, UNICEF’s RapidSMS system enables the registration of children and the monitoring
of their nutritional status in an effort to stem the nation’s high infant mortality rate.]

“In the 2008 Mumbai hotel bomb- 
ing, the terrorists were more effective 
in their use of technology than the offi- 
cials,” notes Rao. “They used informa- 
tion systems and cell phones, while the 
commandos sent in to clean up did not 
have cell phones, so they couldn’t com- 
municate with each other or people in 
the hotel. 

“Official organizations are slow and 
their thinking is ossified,” Rao says, 
“but that will change and technology 
will be better used to reduce the loss of 
lives in disaster management and sup- 
port quicker recovery of communities 
and infrastructure.”


ACM Fellows Honored


Forty-seven men and women are inducted as 2009 ACM Fellows. 


HE ACM FELLOW PROGRAM was 
established by Council in 1993 
to recognize and honor out- 


their achievements in com- 
puter science and information technol- 
ogy and for their significant contributions 
to the mission of the ACM. The ACM Fel- 
lows serve as distinguished colleagues 
to whom the ACM and its members look 
for guidance and leadership as the world 
of information technology evolves. 


The ACM Council endorsed the es- | 


tablishment of a Fellows Program and 
provided guidance to the ACM Fellows 


Committee, taking the view that the pro- | 


gram represents a concrete benefit to 
which any ACM member might aspire, 
and provides an important source of 
role models for existing and prospective 
ACM Members. The program is man- 
aged by the ACM Fellows Committee 
as part of the general ACM Awards pro- 
gram administered by Calvin C. Gotlieb 
and James J. Horning. For details on Fel- 
lows nominations, see p. 19. 

The men and women honored as 
ACM Fellows have made critical contri- 
butions toward and continue to exhibit 
extraordinary leadership in the develop- 
ment of the Information Age and will be 
inducted at the ACM Awards Banquet 
on June 26, 2010, in San Francisco, CA. 
These 47 new inductees bring the to- 
tal number of ACM Fellows to 722 (see 
www.acm.org/awards/fellows/ for the 
complete listing of ACM Fellows). 

Their works span all horizons in 
computer science and information 
technology: from the theoretical realms 
of numerical analysis, combinatorial 
mathematics and algorithmic com- 
plexity analysis; through provinces of 
computer architecture, integrated cir- 
cuits and firmware spanning personal 
computer to supercomputer design; 
into the limitless world of software and 
networking that makes computer systems work and produces solutions and
results that are useful, and fun, for people everywhere.

Their technical papers, books, university courses, computing programs, and
hardware for the emerging computer/ 
communications amalgam reflect the 
powers of their vision and their ability to 
inspire colleagues and students to drive 
the field forward. The members of the 
ACM are all participants in building the 
runways, launching pads, and vehicles of 
the global information infrastructure. 


ACM Fellows 
Hagit Attiya, Technion
David F. Bacon, IBM T.J. Watson Research Center
Ricardo Baeza-Yates, Yahoo! Research
Chandrajit L. Bajaj, University of Texas at Austin
Vijay Bhatkar, International Institute of Information Technology, Pune
José A. Blakeley, Microsoft Corporation
Gaetano Borriello, University of Washington
Alok Choudhary, Northwestern University
Nell B. Dale, University of Texas at Austin (Emerita)
Bruce S. Davie, Cisco Systems
Jeffrey A. Dean, Google, Inc.
Thomas L. Dean, Google, Inc.
Bruce R. Donald, Duke University
Thomas Erickson, IBM T. J. Watson Research Center
Gerhard Fischer, University of Colorado
Ian T. Foster, Argonne National Laboratory/University of Chicago
Andrew V. Goldberg, Microsoft Research Silicon Valley
Michael T. Goodrich, University of California, Irvine
Venugopal Govindaraju, University at Buffalo, SUNY
Rajiv Gupta, University of California, Riverside
Joseph M. Hellerstein, University of California, Berkeley
Laurie Hendren, McGill University
Urs Hoelzle, Google, Inc.
Farnam Jahanian, University of Michigan
Erich L. Kaltofen, North Carolina State University
David Karger, Massachusetts Institute of Technology
Arie E. Kaufman, State University of New York at Stony Brook
Hans-Peter Kriegel, University of Munich (Ludwig-Maximilians-Universitaet Muenchen)
Maurizio Lenzerini, Sapienza Universita di Roma
John C.S. Lui, The Chinese University of Hong Kong
Dinesh Manocha, University of North Carolina at Chapel Hill
Margaret Martonosi, Princeton University
Yossi Matias, Google, Inc.
Renee J. Miller, University of Toronto
John T. Riedl, University of Minnesota
Martin Rinard, CSAIL-MIT
Patricia Selinger, IBM Research
R. K. Shyamasundar, Tata Institute of Fundamental Research
Shang-Hua Teng, University of Southern California
Chandramohan A. Thekkath, Microsoft Corporation - Microsoft Research
Robbert van Renesse, Cornell University
Baba C. Vemuri, University of Florida
Paulo Verissimo, University of Lisbon
Martin Vetterli, Ecole Polytechnic Federale de Lausanne (EPFL)
Kyu-Young Whang, Korea Advanced Institute of Science and Technology (KAIST)
Yorick Wilks, University of Sheffield
Terry Winograd, Stanford University
Privacy and Security
Not Seeing the Crime for the Cameras?


Why it is difficult—but essential—to monitor 
the effectiveness of security technologies. 


IN TERMS OF sales, remote surveillance camera systems, commonly known
as closed-circuit television (CCTV), are a huge success story. Billions
of dollars are spent on CCTV schemes by governments in developed
countries each year, and sales to commercial companies and home users
have been increasing, too. CCTV can be used for many purposes, ranging
from monitoring traffic flows on highways to allowing visitors in zoos
to observe newborn animals during their first few days without
disturbing them.
The vast majority of CCTV purchases are 
made with the aim of improving safety 
and security. The London Underground 
was the first public transport operator 
to install cameras on station platforms, 
so train drivers could check doors were 
clear before closing them. CCTV has 
come a long way since then: last sum- 
mer, the technology writer Cory Doc- 
torow noticed that a single London bus 
now has 16 cameras on it (see Figure 1). 
The advance from analog to digital tech- 
nology had a major impact on CCTV: 
cameras are much smaller and cheaper, 
video is often transmitted wirelessly, 
and recordings are stored on hard disks, rather than tapes. Integration
with other digital technologies offers further possibilities: image
processing makes it possible to recognize automobile license plates
automatically and match them against databases to check if a vehicle
has been reported as stolen, or is uninsured. Advances in hardware,
such as high-definition cameras, and in image processing, such as the
ability to process face and iris information from images taken at a
distance or to detect unattended objects, will enable a wide range of
possible technology solutions (imagine the whole industry salivating).

The burgeoning sales figures and 
ubiquity of cameras suggest that sure- 
ly CCTV technology must be effective. 
The U.K. government has invested 
heavily in CCTV over the past 15 years, 
making it the country with the highest 
CCTV camera-to-person ratio on earth 
(Greater London alone has one cam- 
era for every six citizens). A key driver 
for adoption was that local authorities 
seeking to combat crime could obtain 
government funds to purchase CCTV. 
In the public debate, this policy has been justified mainly with two
arguments: “the public wants it,” and “surely it’s obvious that it
works.” As evidence for the latter, policymakers often point to
high-profile (and often highly emotionally charged) cases:

> In 1993, CCTV images from a shopping mall camera showed police
investigators that the murdered toddler James Bulger had been abducted
by two teenagers, who were then apprehended and convicted.

» Images from London Transport cameras led to the identification and
apprehension of the four men who carried out the failed 7/21 “copycat”
bombing attempts in 2005.

The still images from these cases (see Figure 2a/b) have become iconic:
visual proof that CCTV works. Those who questioned its value in the
public debate, and dared to mention the “p-word,” were largely
dismissed as “privacy cranks,” out of touch with the needs of policing
and the wishes of ordinary citizens. But over the past two years, new
doubts have been raised over the benefits:

> In summer 2008, a report by Lon- 
don police concluded that CCTV con- 
tributed to solving about 3% of street 
crimes. About £500 million ($700 mil- 
lion) has been spent on publicly fund- 
ed CCTV in Greater London. 

> In August 2009, a senior officer in 
the London police stated that, on an 
annual basis, about one crime was re- 
solved for every 1,000 cameras in oper- 
ation. He warned “police must do more 
to head off a crisis in public confidence 
over the use of surveillance cameras.” 

> In September 2009, John Bromley-Davenport, a leading criminal lawyer
in Manchester, said images from CCTV did not prevent crime or help
bring criminals to justice. He prosecuted the killers of a man kicked
to death outside a pub. The incident was recorded on CCTV, but police
officers did not arrive in time to stop the attack, and the quality of
the recorded footage was too low to be used for identification purposes
in court. (The killers were convicted on eyewitness evidence.) The
chief executive of a company that helps police analyze CCTV footage
estimated “that about half of the CCTV cameras in the country are next
to useless when it comes to safeguarding the public against crime and
assisting the police to secure convictions.” Bromley-Davenport said
that large amounts of money spent on technology meant less money was
available to have police officers on the street, and that police
presence was what mattered for preventing crime.

> In October 2009, design college professor Mike Press called for a
moratorium on further CCTV deployments in Scotland, because the
technology was “costly and futile [...] a lazy approach to crime
prevention” that was dangerous because it created “a false sense of
security, encouraging [citizens] to be careless with property and
personal safety.”

Thus, the effectiveness of CCTV is 
being questioned in the country that 
has been a leading and enthusiastic 
adopter. Surely, this must ring alarm 
bells in the industry supplying such sys- 
tems? Not really. The industry response 
is that more advanced technology will 
fix any problems. High-definition cam- 
eras, for instance, would provide better 
image quality and increase likelihood 
of identification. The same chief ex- 
ecutive who said that half of all cur- 
rent cameras were useless suggests 
that “intelligent cameras” will improve 
effectiveness and reduce privacy invasion because they “only alerting a
police officer when a potential incident is taking place.” London
police experts also hope that “future technology will boost conviction
rates using CCTV evidence.”

Figure 1. A single London bus has 16 cameras mounted on it.

The proposals for new technology and effectiveness include building a
national CCTV database of convicted offenders and unidentified
suspects, and use of “tracking technology developed by the sports
advertising industry” to search footage for suspects and incidents.
Since that technology is not quite ready, London police publish images
of suspects on the Internet and ask the public for help. Recruitment of
untrained members of the public to assist in CCTV monitoring is a
growing trend:

> In a London housing project, resi- 
dents have been given access to CCTV 
cameras, books of photos of individu- 
als who had been warned not to tres- 
pass on the estate, and a phone num- 
ber to call if they spotted any of them. 

» In the tourist town of Stratford-on-Avon, residents and businesses
can connect their own CCTV cameras to an Internet portal, and
volunteers who spot and report crimes can win prizes of up to £1,000
(see footnote a).

> Approximately $2 million has been 
spent on Webcams for virtual border 
surveillance at the Texas-Mexico bor- 
der, enabling virtual local residents to 
spot and report illegal immigration. 

The involvement of untrained members of the public in surveillance
harbors many potential risks to privacy, public order, and public
safety (e.g., vigilantism) that must be identified and considered. But
even leaving those concerns aside, early indications from the last
project suggest this is not a quick fix to make CCTV more effective.
The El Paso Times reported in January 2009 that the program was not
effective because only a dozen incidents had been reported. A
spokesperson for the Governor of Texas responded that the problem was
not with the technology, but the way its effectiveness was assessed. It
may look like a weak argument, but it points to the key problem: How do
you assess effectiveness of a security technology such as CCTV? How can
you determine whether the results represent value for the money spent
on technology, or privacy invasions that occur because of its
existence?

a Details of the rewards were revealed last December; see
http://news.bbc.co.uk/1/hi/technology/8393602.stm

The answer is conceptually simple: 
effectiveness of a particular deploy- 
ment means that it achieves its stated 
purpose; efficiency means the desired 
results are worth more than the re- 
sources required to achieve them. But 
the execution of a study to measure 
them is a challenging and costly exercise. One of the few controlled
studies to date was carried out in clothing retail shops in 1999:

» The purpose of installing the systems 
was clearly defined: reduce the stock 
losses through customer and staff theft. 

> The measures for stock losses were 
clearly defined: the number and value 
of stock losses was monitored, and 
any reduction of losses calculated as a 
percentage of sales profits during the 
same period. 

» Stock losses were measured four 
times—twice during a six-month pe- 
riod before and after the introduction 
of CCTV. 

> The efficiency was calculated in 
terms of how many years the system 
would have to operate at the observed 
level of effectiveness to recover its in- 
vestment. 

» During the one-year period, they 
monitored for a number of side effects 
such as footfall, overall sales, customer 
assessment of shops, and so forth. 

Figure 2a/b. Still images from two cases that resulted in apprehension
of perpetrators.

This illustrates that carrying out a meaningful assessment under
controlled conditions requires significant resources and domain
expertise, even for a conceptually simple study: the assessment was
focused on a single crime, the monitoring environment was constant, and
systems for measuring the impact were already in place. The results
showed that stock losses were reduced significantly in the first three
months of CCTV introduction, but then rose again. After six months, the
average loss reduction was a near-insignificant £4; at an average
capital expenditure of £12,000 per CCTV system, it would take 58 years
to recoup the capital cost. In the end, only shops selling high-value
fashion using high-end CCTV systems reduced stock losses to a level
that would mean their investment was recouped within two years. The
authors concluded that anyone buying an off-the-shelf CCTV system may
be wasting their money: only systems designed against a specific threat
in a specific operating environment are effective.
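
The efficiency measure used in the retail study, the number of years a
system must operate to recover its investment, is a simple payback
calculation. The following is a minimal Python sketch of it, not the
study's own method. The function name and parameters are illustrative.
It also assumes that the £4 average loss reduction is a weekly figure;
the text does not state the period, but this reading roughly reproduces
the 58-year payback quoted.

```python
def payback_years(capital_cost: float,
                  saving_per_period: float,
                  periods_per_year: int) -> float:
    """Years of operation needed for cumulative savings to equal the capital cost."""
    if saving_per_period <= 0 or periods_per_year <= 0:
        raise ValueError("savings and periods per year must be positive")
    return capital_cost / (saving_per_period * periods_per_year)


# Figures from the text: GBP 12,000 capital cost, GBP 4 average loss reduction.
# ASSUMPTION (not in the text): the GBP 4 reduction is per week.
years = payback_years(12_000, 4, periods_per_year=52)
print(f"{years:.1f} years")  # 57.7 years, i.e. roughly the 58 years quoted
```

Under the same weekly assumption, recouping the investment within two
years would require a loss reduction of roughly £115 per week.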

A 2005 study of 13 CCTV systems funded by the U.K. government for crime
prevention concluded they had little or no impact on crime recorded by
the police, or on citizens’ perception of crime (based on victimization
rates, fear of crime and other information collected via local
surveys). A common problem was that those who bought the systems were
unclear about their purpose, and hence about their technical and
operating requirements. Many projects were driven by an “uncritical
view that CCTV was ‘a good thing’ and that specific objectives were
unnecessary.” Systems were bought because funding was available, or
because a neighboring town had purchased one. There was no
understanding of what CCTV could achieve, what types of problems it was
best suited to alleviate, and which configuration and support
technologies work best for which requirements. With buyers being
unclear about objectives and lacking expertise, the systems were
generally chosen by the salesperson, who tended to pick the system that
suited the budget. In day-to-day operations, it turned out that many
cameras were ineffective because they were badly placed, broken, or
dirty, or because lighting was insufficient; these problems had
previously been identified in London Underground control rooms. Both
Gill and Spriggs and McIntosh also found that operator performance in
the control room was hampered by a large number of disparate systems
and information sources, and inefficient audio communication channels.
Recent research by my own team found these problems continue to affect
operator performance, as do ever-increasing camera-to-operator ratios.
Recorded video was generally too poor to be used for evidence. These
problems suggest CCTV for crime prevention can only be effective as
part of an overall set of measures and procedures designed to deal with
specific problems. Effective communication and coordination between
CCTV control rooms and those on the ground (police, shop and bar staff,
private security forces) is key, and of course there must be sufficient
staff on the ground to respond. And cameras need clear lines of sight
and sufficient lighting. We found current practice is still a long way
off: cameras were ineffective because of trees and shrubs growing in
front, and autofocus cameras broken because they were pointed at flags
and bunting.

Current research shows that CCTV 
for crime prevention is largely ineffec- 
tive. It is “lazy” to assume that installing 
technology solves the problem. It takes 
domain knowledge and attention to detail to make security technology
work effectively; to date, this has been ignored, with expensive
consequences.
Calendar of Events

February 15-17
International Symposium on BioComputing 2010,
Calicut, India,
Contact: Dan Tulpan,
Phone: 506-861-0958,
Email: dan.tulpan@nrc-cnrc.gc.ca

February 19-21
Symposium on Interactive 3D Graphics and Games,
Bethesda, MD,
Sponsored: SIGGRAPH,
Contact: Chris Wyman,
Phone: 319-353-2549,
Email: cwyman@cs.uiowa.edu

February 22-23
Workshop on Mobile Opportunistic Networking,
Pisa, Italy,
Sponsored: SIGMOBILE,
Contact: Sergio Polazzo,
Phone: 390957382370,
Email: polazzo@iit.unict.it

February 22-23
Multimedia Systems Conference,
Phoenix, Arizona,
Sponsored: SIGMM,
Contact: Wu-Chi Feng,
Phone: 503-725-2408,
Email: wuchi@cs.pdx.edu

February 25-27
India Software Engineering Conference,
Mysore, India,
Contact: Srinivas Padmanabhuni,
Email: s_padmana@yahoo.com

February 26-27
International Conference and Workshop on Emerging Trends in Technology,
Mumbai, India,
Contact: Poorva Girish Waingankar,
Email: poorva.waingankar@thekureducation.org

March 2-5
International Conference on Human Robot Interaction,
Nara, Japan,
Sponsored: SIGCHI, SIGART,
Contact: Pamela J. Hinds,
Phone: 650-723-3843,
Email: phinds@stanford.edu

March 2-5
IEEE Pacific Visualization 2010,
Taipei, Taiwan,
Contact: Shen Han-Wei,
Email: hwshen@cse.ohio-state.edu
Education


Dennis P. Groth and Jeffrey K. MacKie-Mason 


Why an Informatics Degree? 


Isn’t computer science enough? 


WHAT IS AN informatics degree, and why? These are questions that have
been posed to us on innumerable occasions for almost a decade by
students, parents, employers, and colleagues, and when asked to prepare
a Communications Education column to answer that question, we jumped at
the opportunity.

The term “informatics” has different definitions depending on where it
is used. In Europe, for instance, computer science is referred to as
informatics. In the U.S., however, informatics is linked with applied
computing, or computing in the context of another domain. These are
just labels, of course. In practice, we are educating for a broad
continuum of computing disciplines, applications, and contexts
encountered in society today.


From Computer Science to Informatics

Computing provides the foundation for science, industry, and ultimately
for the success of society. Computing education traditionally has
focused on a set of core technological and theoretical concepts, and
teaching these concepts remains critically important. Meanwhile,
advances in computing occur and are driven by the need to solve
increasingly complex problems in domains outside traditional computer
science. Students, teachers, and scholars in other fields are keenly
interested in computational thinking, and computing itself increasingly
is informed by the challenges of other disciplines. For example, to
design good online auction technology, computer scientists found that
they needed to understand how humans would select bidding strategies
given the system design, and indeed how to design the system to
motivate certain types of behavior (truthful value revelation, for
example). This co-design problem led to fruitful interdisciplinary
collaborations between computer scientists, economists and,
increasingly, social psychologists. Likewise, designing successful
technology for trust, privacy, reputation, and sharing in social
computing environments requires both computer science and behavioral
science.

Informatics programs offer diverse applications, as shown in these
scenes from the informatics program at Indiana University, Bloomington.

These interactions between problem domain context and computational
design are characteristic of the maturing of computer science.
Computing is no longer owned solely by computer science, any more than
statistics is owned solely by faculty in statistics departments.
Computing and computational thinking have become ubiquitous, and
embedded in all aspects of science, research, industry, government, and
social interaction. Consider the flurry of excitement about
“e-commerce” in the late 1990s. Quickly e-commerce moved from being
seen as a new field to being absorbed in “commerce”: the study of
business communications, logistics, fulfillment, and strategy, for
which the Internet and computing were just two technologies in a
complex infrastructure.

How then does computing educa- 
tion need to change to respond to the 
new reality, and more importantly, to 
be equipped to respond to future de- 
velopments? We must embrace the 
diversity of ways in which problems 
are solved through the effective use of
computing, and we must better under- 
stand the diverse problem domains 
themselves. 

The vision for informatics follows from the natural evolution of
computing. The success of computing is in the
resolution of problems, found in areas 
that are predominately outside of com- 
puting. Advances in computing—and 
computing education—require greater 
understanding of the problems where 
they are found: in business, science, 
and the arts and humanities. Students 
must still learn computing, but they 
must learn it in contextualized ways. 
This, then, provides a definition for in- 
formatics: informatics is a discipline 
that solves problems through the appli- 
cation of computing or computation, in 
the context of the domain of the prob- 
lem. Broadening computer science 
through attention to informatics not 
only offers insights that will drive ad- 
vances in computing, but also more op- 
tions and areas of inquiry for students, 
which will draw increasing numbers of 
them to study computation. 


Informatics Programs 

Computer science is focused on the 
design of hardware and software tech- 
nology that provides computation. 
Informatics, in general, studies the intersection of people,
information, and technology systems. It focuses on the ever-expanding,
ubiquitous, and embedded relationship between information systems and
the daily lives of people, from simple systems that support personal
information management to massive distributed databases manipulated in
real time. The field helps design new uses for information technology
that reflect and enhance the way people create, find, and use
information, and it takes into account the strategic, social, cultural,
and organizational settings in which those solutions will be used.

In the U.S., informatics programs 
emerged over the past decade, though 
not always under the informatics 
name, and often in different flavors 
that bear the unique stamp of their 
faculty. Prominent examples include 
“Informatics” (Indiana University, University of Michigan, University
of Washington, UC Irvine), “Human Computer Interaction” (Carnegie
Mellon University), “Interactive Computing” (Georgia
Tech), “Information Technology and 
Informatics” (Rutgers), and “Informa- 
tion Science and Technology” (Penn 
State). Some programs emerged pri- 
marily from computer science roots; 
others from information and social sci- 
ence roots. They do all generally agree 
on the centrality of the interaction of 
people and technology, and thus re- 
gardless of origin they are multidisci- 
plinary and focus on computation in 
human contexts. 

Informatics is fundamentally an 
interdisciplinary approach to domain 
problems, and as such is limited nei- 
ther to a single discipline nor a single 
domain. This is evident in another type 
of diversity in such programs: some 
take a fairly broad approach, with 
several distinct tracks or application 
domains, which can range as widely 
as art and design, history, linguistics, 
biology, sociology, statistics and economics. Other programs are
limited to a single application domain, such as bioinformatics (for
example, Iowa State, Brigham Young, and UC Santa Cruz). Thus,
informatics programs can have as many differences as they have
commonalities. This has been reflected in some confusion and
frustration about how to establish a community of interest. For
example, there is an “iSchool” caucus (about 27 members),
and a partially overlapping CRA (IT) 
Deans group (about 40 members). To 
illustrate some of the issues, we will 
describe two of the broader programs 
with which we are most familiar. 

The School of Informatics and Com- 
puting at Indiana University Blooming- 
ton offers a traditional CS degree and 
an informatics degree, which was first
offered in 2000. Its informatics curricu- 
lum is focused along three dimensions 
that are first presented in an introduc- 
tory course: foundations, implications, 
and applications. Unlike most tradi- 
tional computer science curricula, the 
introductory course does not focus on 
programming as the sole problem- 
solving paradigm. Instead, a number 
of skills, concepts, and problem solv- 
ing techniques are introduced and 
motivated by context-based problems, 
including logical reasoning, basic pro- 
gramming, teamwork, data visualiza- 
tion, and presentation skills. Following 
this introduction, foundations courses 
include discrete math and logical reasoning, a two-course programming
sequence, and a course on data and
information representation, while 
implications courses include social 
informatics and human computer in- 
teraction. The foundations topics are 
similar to those in a computer science 
program; however, the ordering is quite 
different, in that programming comes 
last rather than first. This sequencing 
increases retention in the major be- 
cause students have more time to de- 
velop their technical skills. 

At Indiana, the interdisciplinary 
component of the curriculum is accomplished through a mixture of three
methods: elective courses covering
technology use and issues in specific 
problem domains; a required senior 
capstone project, aimed at solving a 
“real-world” problem; and a required 
cognate specialization of at least five 
courses in another discipline. There 
are currently over 30 different special- 
izations from around 20 disciplines 
available, including: business, fine 
arts, economics, information security, 
biology, chemistry, telecommunica- 
tions, and geography. 

The School of Information (SI) at 
the University of Michigan has offered 
master’s and Ph.D. degrees in Informa- 
tion since 1996. In 2008 SI joined with 
the Computer Science and Engineering 
Division, and the College of Literature, 
Science and Arts, to offer a joint under- 
graduate informatics degree. To enter 
the major, students are required to com- 
plete one prerequisite each in calculus, 
programming, statistics, and informa- 
tion science. They then take a 16-credit 
core in discrete math, data structures, 
statistics, and information technology ethics. Each then selects a
several-course specialization track, which is interdisciplinary but
focuses on providing depth in a particular domain: computational
informatics, information analysis, life science informatics, or social
computing. This program establishes a strong foundation, domain depth
and interdisciplinary training. However, to
accomplish all of this, it also imposes 
on students the heaviest required-credit 
burden of any liberal arts major. 

The equal participation by the Computer Science and Engineering
Division in the Michigan degree emphasizes the ability to design an
informatics program as a complement to a traditional computer science
degree; indeed, the Computer Science and Engineering Division continues
to offer two traditional CS bachelor’s degrees (one in engineering, one
in liberal arts). One advantage expected for the contextualized
informatics degree is higher enrollment of women, and indeed, about
half the class of declared majors is female. On the downside, managing
a degree that spans three colleges and schools is challenging, with
natural hurdles such as teaching budgets and credit approvals across
units.


Looking Forward 


Informatics curricula are young and developing, but have proven
popular. Indiana has over 400 students in the major. In just its first
year, Michigan attracted 40 undergraduate majors. Evidence comes also
from successful courses offered outside a formal informatics program.
For example, a computer scientist and an economist at Cornell enroll
about 300 students annually in interdisciplinary “Networks,” which
counts toward the majors in Computer Science, Economics, Sociology, and
Information Science (see footnote a). At the University of
Pennsylvania, “Networked Life” (taught by a computer scientist)
attracts about 200 students, and satisfies requirements in three
majors: Philosophy, Politics, and Economics; Science, Technology, and
Society; and Computer and Information Science (see footnote b).

a See http://www.infosci.cornell.edu/courses/info2040/2009sp/

b See http://www.cis.upenn.edu/~mkearns/teaching/NetworkedLife/
Informatics enables students to combine passions for both computation
and another domain. Since almost all domains now benefit from
computational thinking, an informatics program can embrace students and
concentrations in art and design, history, linguistics, biology,
sociology, statistics, and economics. This diversity has costs, of
course. One is that for now, in the early years, students and faculty
must continuously explain “informatics” to potential employers. Another
is providing strong enough foundations in both computation and another
discipline to produce competitive, successful graduates.

The desire to deeply understand 
how computing works is what has 
drawn most researchers to study com- 
puter science. These same individuals 
are then invested with the responsibil- 
ity to develop curricular programs and 
teach computing to the next genera- 
tion of computing professionals. The 
current (and all future) generations of 
students entering the university have 
largely grown up in a world where 
computing is so commonplace that 
it is taken for granted. Many of them 
are less interested in how computing works than in how to make it work better in the solution of specific problems,
drawn from virtually all other domains 
of human knowledge. There will always 
be a need for students who study com- 
puter science. Informatics provides a 
complementary path to reach other 
students for whom understanding and 
developing computation contextually 
is crucial to the problems that motivate 
them. Like mathematics, probability, 
and logic, in the future computation 
science will be taught embedded in 
many other areas. Indeed, informatics 
is a path within which the technical ac- 
complishments of computer science, 
mathematics, and statistics become 
embedded in the ways we interact, 
imagine, and produce throughout the 
scope of human experience.


Inside Risks
Douglas Maughan
The Need for a National Cybersecurity Research and Development Agenda

Government-funded initiatives, in cooperation with private-sector partners in key technology areas, are fundamental to cybersecurity technical transformation.

Communications’ Inside Risks columns over the past two decades have frequently been concerned with trustworthiness of computer-communication systems and the applications built upon them. This column considers what is needed to attain new progress toward avoiding the risks that have prevailed in the past as a U.S. national cybersecurity R&D agenda is being developed. Although the author writes from the perspective of someone deeply involved in research and development of trustworthy systems in the U.S. Department of Homeland Security, what is described here is applicable much more universally. The risks of not doing what is described here are very significant.
Peter G. Neumann

Cyberspace is the complex, dynamic, globally interconnected digital and information infrastructure that underpins every facet of society and provides critical support for our personal communication, economy, civil infrastructure, public safety, and national security. Just as our dependence on cyberspace is deep, so too must be our trust in cyberspace, and we must provide technical and policy solutions that enable four critical aspects of trustworthy cyberspace: security, reliability, privacy, and usability.

[Photo: President Barack Obama greets White House Cyber Security Chief Howard A. Schmidt, who was appointed in December 2009.]

The U.S. and the world at large are currently at a significant decision point. We must continue to defend our existing systems and networks. At the same time, we must attempt to be ahead of our adversaries, and ensure future generations of technology will position us to better protect critical infrastructures and respond to attacks from adversaries. Government-funded research and development must play an increasing role toward achieving this goal of national and economic security.


Background 

On January 8, 2008, National Security Presidential Directive 54/Homeland Security Presidential Directive 23 formalized the Comprehensive National Cybersecurity Initiative (CNCI) and a series of continuous efforts designed to establish a frontline defense (reducing current vulnerabilities and preventing intrusions), protect against the full spectrum of threats by using intelligence and strengthening supply chain security, and shape the future environment by enhancing our research, development, and education, as well as investing in “leap-ahead” technologies.

No single federal agency “owns” 
the issue of cybersecurity. In fact, the 
federal government does not uniquely 
own cybersecurity. It is a national and 
global challenge with far-reaching 
consequences that requires a coopera- 
tive, comprehensive effort across the 
public and private sectors. However, 
as it has done historically, the U.S. gov- 
ernment R&D community, working in 
close cooperation with private-sector 
partners in key technology areas, can 
jump-start the necessary fundamental 
technical transformation. 


Partnerships 

The federal government must reenergize two key partnerships to successfully secure the future cyberspace: the partnership with the educational system and the partnership with the private sector. The Taulbee Survey has shown that our current educational system is not producing the cyberspace workers of the future and the current public-private partnerships are inadequate for taking R&D results and deploying them across the global infrastructure.
Education. A serious, long-term problem with ramifications for national security and economic growth is looming: there are not enough U.S. citizens with computer science (CS) and science, technology, engineering, and mathematics (STEM) degrees being produced. The decline in CS enrollments and degrees is most acute. The decline in undergraduate CS degrees portends the decline in master’s and doctoral degrees as well. Enrollments in major university CS departments have fallen sharply in the last few years, while the demand for computer scientists and software engineers is high and growing. The Taulbee Survey confirmed that CS (including computer engineering) enrollments are down 50% from only five years ago, a precipitous drop by any measure. Since CS degrees are a subset of the overall requirement for STEM degrees and show the most significant downturn, CS degree production can be considered a bellwether to the overall condition and trend of STEM education. The problems with other STEM degrees are equally disconcerting and require immediate and effective action. At the same time, STEM jobs are growing, and CS jobs are growing faster than the national average.

At a time when the U.S. experiences 
cyberattacks daily and as global com- 
petition continues to increase, the U.S. 
cannot afford continued ineffective ed- 
ucational measures and programs. Re- 
vitalizing educational systems can take 
years before results are seen. As part of 
an overall national cybersecurity R&D 
agenda, the U.S. must incite an extraor- 
dinary shift in the number of students 
in STEM education quickly to avoid a serious shortage of computer scientists, engineers, and technologists in the decades to come.


Public-Private Partnerships. Information and communications networks are largely owned and operated by the private sector, both nationally and internationally. Thus, addressing cybersecurity issues requires public-private partnerships as well as international cooperation. The public and private sector interests are dependent on each other and share a responsibility for ensuring a secure, reliable infrastructure. As the federal government moves forward to enhance its partnerships with the private sector, research and development must be included in the discussion. More and more private-sector R&D is falling by the wayside and, therefore, it is even more important that government-funded R&D can make its way to the private sector, given that the private sector designs, builds, owns, and operates most of the critical infrastructures.


Technical Agenda 
Over the past decade there have been a significant number of R&D agendas published by various academic and industry groups, and government departments and agencies (these documents can be found online at http://www.cyber.st.dhs.gov/documents.html). A 2006 federal R&D plan identified at least eight areas of interest with over 50 project topics that were either being funded or should be funded by federal R&D entities. Many of these topic areas have been on the various lists for over a decade. Why? Because the U.S. has underinvested in these R&D areas, both within the government and private R&D communities.

The Comprehensive National Cybersecurity Initiative (CNCI) and the President’s Cyberspace Policy Review
challenged the federal networks and
IT research community to figure out 
how to “change the game” to address 
these technical issues. Over the past 
year, through the National Cyber Leap 
Year (NCLY) Summit and a wide range 
of other activities, the U.S. government 
research community sought to elicit 
the best ideas from the research and 
technology community. The vision of 
the CNCI research community over the 
next 10 years is to “transform the cyber- 
infrastructure to be resistant to attack 
so that critical national interests are 
protected from catastrophic damage 
and our society can confidently adopt 
new technological advances.” 

The leap-ahead strategy aligns with the consensus of the U.S. networking and cybersecurity research communities: that the only long-term solution to the vulnerabilities of today’s networking and information technologies is to ensure that future generations of these technologies are designed with security built in from the ground up. Federal agencies with mission-critical needs for increased cybersecurity, which includes information assurance as well as network and system security, can play a direct role in determining research priorities and assessing emerging technology prototypes.

The Department of Homeland Security Science and Technology Directorate has published its own roadmap in an effort to provide more R&D direction for the community. The Cybersecurity Research Roadmap addresses a broad R&D agenda that is required to enable production of the technologies that will protect future information systems and networks. The document provides detailed research and development agendas relating to 11 hard problem areas in cybersecurity, for use by agencies of the U.S. government. The research topics in this roadmap, however, are relevant not just to governments, but also to the private sector and anyone else funding or performing R&D.

While progress in any of the areas 
identified in the reports noted previous- 
ly would be valuable, I believe the “top 
10” list consists of the following (with 
short rationale included): 

1. Software Assurance: poorly writ- 
ten software is at the root of all of our 
security problems; 

2. Metrics: we cannot measure our 
systems, thus we cannot manage them; 

3. Usable Security: information se- 
curity technologies have not been de- 
ployed because they are not easily usable; 

4. Identity Management: the ability
to know who you are communicating 
with will help eliminate many of today’s 
online problems, including attribution; 

5. Malware: today’s problems contin- 
ue because of a lack of dealing with ma- 
licious software and its perpetrators; 

6. Insider Threat: one of the biggest 
threats to all sectors that has not been 
adequately addressed; 

7. Hardware Security: today’s com- 
puting systems can be improved with 
new thinking about the next generation 
of hardware built from the start with se- 
curity in mind; 

8. Data Provenance: data has the 
most value, yet we have no mechanisms 
to know what has happened to data 
from its inception; 

9. Trustworthy Systems: current sys- 
tems are unable to provide assurances 
of correct operation to include resil- 
iency; and 

10. Cyber Economics: we do not understand the economics behind cybersecurity for either the good guy or the bad guy.


Life Cycle of Innovation 
R&D programs, including cybersecurity R&D, consistently have difficulty in taking the research through a path of development, testing, evaluation, and transition into operational environments. Past experience shows that transition plans developed and applied early in the life cycle of the research program, with probable paths for the research product, are effective in achieving successful transfer from research to application and use. It is equally important, however, to acknowledge that these plans are subject to change and must be reviewed often. It is also important to note that different technologies are better suited for different technology transition paths and in some instances the choice of the transition path will mean success or failure for the ultimate product. There are guiding principles for transitioning research products. These principles involve lessons learned about the effects of time/schedule, budgets, customer or end-user participation, demonstrations, testing and evaluation, product partnerships, and other factors.

A July 2007 U.S. Department of Defense Report to Congress on Technology Transition noted there is evidence that a chasm exists between the DoD S&T communities and acquisition of a system prototype demonstration in an operational environment. DoD is not the only government agency that struggles with technology transition. That chasm, commonly referred to as the “valley of death,” can be bridged only through cooperative efforts and investments by both research and acquisition communities.


There are at least five canonical transition paths for research funded by the federal government. These transition paths are affected by the nature of the technology, the intended end user, participants in the research program, and other external circumstances. Success in research product transition is often accomplished by the dedication of the program manager through opportunistic channels of demonstration, part-
However, no single approach is more effective than a proactive technology champion who is allowed the freedom to seek potential utilization of the research product. The five canonical transition paths are:
> Department/Agency direct to Acquisition
> Department/Agency to Government Lab
> Department/Agency to Industry
> Department/Agency to Academia to Industry
> Department/Agency to Open Source Community
In order to achieve the full results of R&D, technology transfer needs to be a key consideration for all R&D investments. This requires the federal government to move past working models where most R&D programs support only limited operational evaluations and experiments. In these old working models, most R&D program managers consider their job done with final reports, and most research performers consider their job done with publications. In order to move forward, government-funded R&D activities must focus on the real goal: technology transfer, which follows transition. Current R&D principal investigators (PIs) and program managers (PMs) aren’t rewarded for technology transfer. Academic PIs are rewarded for publications, not technology transfer. The government R&D community must reward government program managers and PIs for transition progress.


Conclusion 

As noted in the White House Cyberspace Policy Review, an updated national strategy for securing cyberspace is needed. Research and development must be a full partner in that discussion. It is only through innovation creation that the U.S. can regain its position as a leader in cyberspace.


Open Access to Scientific Publications


The good, the bad, and the ugly. 


In his July 2009 Communications editor’s letter “Open, Closed, or Clopen Access?”, editor-in-chief Moshe Vardi addressed the question of open access to this magazine and to ACM publications in general. Scientific publishing, like all areas of publishing, is undergoing major changes. One reason is the advent of the Internet, which fosters new types of publishing models. Another less-known factor is the exponential increase in the number of scientific publications (see the figure here), which has turned this area into a serious business. In this column, I take a look at commercial and Open Access publishing, and at the role that professional societies such as ACM can play in this evolving world.


Commercial Publishing 

Scientific publishing is a profitable business: at more than 30%, the operating profit margins of major commercial publishers are among the highest across all businesses.ᵃ A major consequence has been a massive concentration of commercial publishers of scientific, technical, and medical (STM) publications, with one giant (Elsevier) and a few big players (Springer, Thomson, Wiley). This concentration has coincided with sharp increases in subscription rates, and has generated razor-sharp business practices whereby, for example, publishers sell subscriptions to a bundle of titles that typically contain one or two good journals among a set of second-tier ones. The quality of a journal is typically measured by its impact factor, the average number of citations to articles in this journal over a unit time (typically three years). Because of the competition among publishers, impact factors can be, and are, manipulated: Commercial publishers ask their editors-in-chief to “encourage” authors of accepted papers to include references to their journals. (Since they pay their editors-in-chief, it makes them more “receptive” to such requests.) The Web-based version of EndNote, the well-known reference searching tool, facilitates references to publications indexed by ISI Web-of-Science, the division of Thomson that computes the very impact factors mentioned previously.

a See, for example, http://www.researchinformation.info/features/feature.php?feature_id=141
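To make the impact-factor definition above concrete, here is a minimal Python sketch that follows the column's simplified wording (average citations per article over some window). It is not the official calculation used by any indexing service; the function name and the citation counts are made up for illustration.

```python
def impact_factor(citations_per_article: list[int]) -> float:
    """Average citations per article, per the column's simplified definition."""
    if not citations_per_article:
        raise ValueError("need at least one article")
    return sum(citations_per_article) / len(citations_per_article)


# Hypothetical citation counts for five articles in one journal.
baseline = [3, 0, 5, 2, 10]

# Simplifying assumption: "encouraged" references add one extra
# citation to each of the same five articles.
nudged = [count + 1 for count in baseline]

before = impact_factor(baseline)
after = impact_factor(nudged)
print(f"before: {before:.1f}, after: {after:.1f}")
```

Under this assumption a single extra citation per article moves the average by a full point, which suggests why such a metric is sensitive to the kind of encouragement described above.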

Over the years, commercial STM 
publishing has become a cutthroat 
business with cutthroat practices and 
we, the scientific and academic com- 
munity, are the naive lambs, blinded 
by the ideals of science for the public 
good—or simply in need of more pub- 
lications to advance our careers. 

Fortunately, a number of research- 
ers and academic leaders woke up 
one day and said: “We do not need 
commercial publishers. We want the 
results of our research, which is of- 
ten funded by taxpayers’ money, to be 
available for free to the public at large. 
With the Internet, the costs of publish- 
ing are almost zero, and therefore we 
can make this work.” And so was born 
the white knight of STM publishing: 
Open Access. 


Open-Access Publishing 

But the proponents of Open Access quickly realized that online publishing is neither free nor cheap. Management, equipment, and access costs add up quickly. For example, ACM spends several million dollars every year to support the reliable data center serving the Digital Libraryᵇ and to incorporate new data, improve cross-references, and develop new services.

Since Open Access needs funding, where can it come from?ᶜ An obvious answer is advertising, but it is not a sustainable option, at least for now. A less obvious answer, but one that is quickly gaining momentum, is called author charges (or publication fees): since Open Access does not charge readers, authors will pay to publish their works. This should be painless for authors because they are also readers: it simply transfers charges from subscriptions to authorship. In fact, the proponents of this model explicitly encourage researchers to include author charges in their budgets when they apply for grants. The NIH explicitly supports Open Access and accepts such costs. But how much are authors ready to pay to publish an article? A few hundred dollars? The most prominent Open Access publisher, the Public Library of Science (PLOS), is a nonprofit organization that has received several million dollars in donations. Yet it charges between $1,350 and $2,900 per paper, depending on the journal.ᵈ In fact, many in the profession estimate that to be sustainable, the author-pay model will need to charge up to $5,000-$8,000 per publication.

b As an example, on Sept. 11, 2001, ACM was prepared to switch to a backup database in another location in the country to provide uninterrupted access to the Digital Library.

c See http://www.arl.org/sparc/publisher/incomemodels/ for a fairly complete list.

[Figure: Exponential increase in the scientific production in the medical (MED) and natural sciences and engineering (NSE) fields. The vertical scale is logarithmic. The number of published articles for 2004 is about 500,000 and the number of references is about 10 million. Data provided by Yves Gingras.]

Consider what this means. For example, I am the head of the Laboratory for Computer Science at Université Paris-Sud in France. We publish over 100 journal articles annually. At the conservative estimate of $2,500 per article, the author fees would cost us $250,000 per year. This is more than four times our current budget for journal subscriptions. And, of course, since not every publisher is going to turn to that model overnight, we would have to keep traditional subscriptions. At $5,000 per publication, my lab is broke.
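The arithmetic behind this estimate can be sketched in a few lines of Python. The inputs (100 articles per year, $2,500 and $5,000 per article) come from the text; the function name is hypothetical. The column does not state the lab's subscription budget, only that $250,000 is more than four times that budget, so the sketch derives the implied upper bound rather than assuming a figure.

```python
def annual_author_fees(articles_per_year: int, fee_per_article: float) -> float:
    """Total author charges for one year under an author-pay model."""
    return articles_per_year * fee_per_article


ARTICLES_PER_YEAR = 100  # "over 100 journal articles annually", so a lower bound

conservative = annual_author_fees(ARTICLES_PER_YEAR, 2_500)
high = annual_author_fees(ARTICLES_PER_YEAR, 5_000)

# The column says the conservative total is "more than four times" the
# subscription budget, so that budget must be below this bound.
implied_budget_upper_bound = conservative / 4

print(f"At $2,500 per article: ${conservative:,.0f} per year")
print(f"At $5,000 per article: ${high:,.0f} per year")
print(f"Implied subscription budget: under ${implied_budget_upper_bound:,.0f}")
```

Because the lab would also have to keep its traditional subscriptions during any transition, these totals would come on top of the existing budget rather than replace it.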

Funding agencies are unlikely to cover these extra costs. If they do, it will be within the same overall budget, meaning less money for manpower, equipment, and travel. Also, how would funding agencies pay for papers published after the end of a grant, as is often the case with journal publications? How would researchers decide between two papers when budgets are tight? More than ever, the rich will be able to publish more and more easily than the poor. And even though Open Access publishers do have policies to lower or waive the fees for those who cannot pay, it is embarrassing just to have to ask.

In fact, those who benefit the most from this model are neither the scientific community nor the general public. They are the big pharmaceutical labs and the tech firms who publish very little but rely on the publication of scientific results for their businesses.ᵉ With author-pay, research will pay so that industry can get their results for free. Is this moral? The only other area in publishing where authors pay to get published is called the vanity press. Do we really want to enter that model?

Not surprisingly, commercial publishers have considered Open Access a potential threat. But they quickly realized that the author-pay model could work for them, too. Many publishers are already testing a dual model: authors can publish an article without charge, in which case it is available to subscribers only, or with an author charge, in which case it is available for free. This is the best of both worlds: charging both readers and authors!

So while Open Access was designed to provide an alternative to commercial publishing, it may well be consumed by it. Now, authors, not just readers, are the publishers’ market. For example, I can easily imagine these publishers soon offering universities special deals with reduced author fees in exchange for exclusive rights to the publications of that university, jeopardizing academic freedom.

d See http://www.plos.org/journals/pubfees.html

e Elsevier has admitted to creating fake journals sponsored by pharmaceutical labs (see, for example, http://www.the-scientist.com/blog/display/55679/).


The Role of Professional Societies 
Can we get out of this situation? Can we escape both the escalating subscription fees of commercial publishers and the dangerous author fees of prominent Open Access publishers? It is important to understand that the scientific community is largely at fault: we sit on the editorial boards of the very journals published at exorbitant prices by commercial publishers,ᶠ and we submit our best articles to these journals.

The problem with the subscription model is not the model but the fees. Rob Kirby, of the UC Berkeley Math Department, has compared the cost-per-page of various mathematics journals, computed as the subscription price divided by the number of pages published annually.ᵍ In 1997, they ranged from $0.07 to $1.53. The cost per 10,000 characters, which better accounts for differences among journal formats, ranged from 30 cents to $3. Consistently, the cheaper journals are published by universities and societies; the most expensive ones by commercial publishers. In 2003, Donald Knuth, editor of Journal of Algorithms, wrote a long letter to his editorial boardʰ explaining that the price per page of the journal had more than doubled since it had been acquired by Elsevier, while it had stayed stable over the previous period, when it was published by Academic Press. This led to a mass resignation of the board and the rebirth of the journal as ACM Transactions on Algorithms. Another well-known example is the Journal of Machine Learning Research, which became its own Open Access publisher for similar reasons. A number of journals have joined this trend,ⁱ but few have turned to Open Access.

f I am an associate editor of an Elsevier-published journal.

g See http://math.berkeley.edu/~kirby/journals.html

h See http://www-cs-faculty.stanford.edu/~knuth/joalet.pdf

i For a list, see http://oad.simmons.edu/oadwiki/Journal_declarations_of_independence
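The cost-per-page comparison described above is a simple ratio, and a short Python sketch may make the two measures easier to compare. The formulas follow the text (subscription price divided by pages published annually, and the same price normalized per 10,000 characters). The function names and the sample journal figures are hypothetical; only the $0.07 and $1.53 endpoints come from the column.

```python
def cost_per_page(subscription_price: float, pages_per_year: int) -> float:
    """Subscription price divided by the number of pages published annually."""
    if pages_per_year <= 0:
        raise ValueError("pages_per_year must be positive")
    return subscription_price / pages_per_year


def cost_per_10k_chars(
    subscription_price: float, pages_per_year: int, chars_per_page: int
) -> float:
    """Price per 10,000 characters, which normalizes for page format."""
    total_chars = pages_per_year * chars_per_page
    if total_chars <= 0:
        raise ValueError("page and character counts must be positive")
    return subscription_price / total_chars * 10_000


# Hypothetical journal: $600 per year, 2,000 pages, 3,000 characters per page.
per_page = cost_per_page(600, 2_000)
per_10k = cost_per_10k_chars(600, 2_000, 3_000)

# Spread between the cheapest and most expensive 1997 figures quoted above.
spread = 1.53 / 0.07

print(f"cost per page: ${per_page:.2f}")
print(f"cost per 10,000 characters: ${per_10k:.2f}")
print(f"most expensive vs. cheapest: about {spread:.0f}x")
```

The per-character measure matters because a journal with dense pages can look expensive per page while being cheap per unit of content, which is the format difference the text refers to.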

So, am I against Open Access? No. I think it is a noble goal, an achievable goal. But this goal should not blind us to the point of making a bad system even worse, of hurting research in the name of making its results freely available to everyone.

First, scientific publications can be affordable. The pricing of the ACM Digital Library is extremely low, even compared to other societies and nonprofit organizations. This is still not enough. The pricing model is adequate for the academic and industry audience but not for dissemination toward the public at large. As shown by the success of online stores such as iTunes, low pricing can translate into large volumes. Commercial publishers charge non-subscribers up to $30 to download a single paper; ACM charges $15. What if it were 99 cents? While I am not saying that scientific publishing is a mass market like music, I do believe this would dramatically reduce the barrier to non-subscribers, in particular the general public, without significantly affecting the revenues from subscriptions.

Second, much of this debate has focused on cost. But free access is, to paraphrase the Free Software Movement, as much about free beer as it is about free speech. Many publishers, including ACM, allow their authors to publish copies of their articles on their personal Web page or on their institutional repository.ʲ But the transfer of copyright is seen by some as a serious hindrance to open access, as it deprives authors of distribution rights. While copyright transfer offers authors protection (such as against plagiarism) and services (such as authorization to reprint), I believe switching to a licensing model such as Creative Commons could be beneficial.

The added value provided by publishers is twofold: reputation (the value of the imprimatur), and archiving (the guarantee that the work will be available forever). These allow publishers to provide services that self-publishing and even institutional repositories cannot provide, such as the author pages that were recently added to the ACM Digital Library. Little if any of this relies on the actual transfer of copyright. While publishers value the exclusivity granted by copyright transfer, users (authors and readers alike) value the services that make articles easier to find: indexing, cross-referencing, searching, and so forth. A proper licensing model could foster novel services for scientific dissemination, including by third parties, without challenging the primary values and revenue streams of publishers, in particular non-profit ones.

j See section 2.5 of the ACM copyright policy, http://www.acm.org/publications/policies/copyright_policy, and the SHERPA/ROMEO list of publishers’ copyright and self-archiving policies, http://www.sherpa.ac.uk/romeo/


Conclusion 

Open Access is a valuable goal, but the 
scientific community is overly naive 
about the whole business of scientific 
publishing. Societies and nonprofit 
organizations need to continue to lead 
the way to improve the dissemination 
of research results, but the scientific 
community at large must support them against the business-centric views of commercial publishers. Author fees are not a solution. Worse, they jeopardize the ecological balance of the research
incentive structure. Finally, nonprofit 
publishers should take advantage of 
their unique position to experiment 
with sustainable evolutions of their 
publishing models. 


Michel Beaudouin-Lafon (mbl@lri.fr) is a professor of 
computer science at Université Paris-Sud (France) and 
head of the Laboratory for Computer Science (LRI),

Article development led by queue.acm.org


Kode Vicious 
Taking Your Network’s Temperature


A prescription for capturing data to diagnose and debug a networking problem. 


Dear KV, 


I posted a question on a mailing list
recently about a networking problem
and was asked if I had a tcpdump. The 
person who responded to my ques- 
tion—and to the whole list as well— 


seemed to think my lack of networking
knowledge was some kind of affront to
him. His response was pretty much a 
personal attack: If I couldn’t be both- 
ered to do the most basic types of de- 
bugging on my own, then I shouldn’t 
expect much help from the list. Aside 
from the personal attack, what did he 
mean by this? 
Dumped 


Dear Dumped, 

It is always interesting to me that when 
people study computer programming 
or software engineering they are taught 
to use the creative tools—editors to 
create code, compilers to take that 
code and turn it into an executable— 
but are rarely, if ever, taught how to 
debug a program. Debuggers are pow- 
erful tools, and once you learn to use 
one you become a far more productive 
programmer because, face it, putting 
printf()—or its immoral equiva-
lent—throughout your code is a really
annoying way to find bugs. In many 
cases, especially those related to tim- 
ing issues, adding print statements 
just leads to erroneous results. If the
number of people who actually learn
how to debug a program during their
studies is small, the number who learn
how to debug a networking problem is
minuscule. I actually don’t know any-
one who was ever directly taught how
to debug a networking problem.

Some people—the lucky ones—are 
eventually led to the program you men- 
tion, tcpdump, or its graphical equiva- 
lent, wireshark, but I’ve never seen 
anyone try to teach people to use these 
tools. One of the nice things about
tcpdump and wireshark is that they’re
multi-platform, running on both Unix- 
like operating systems and Windows. 
In fact, writing a packet-capture pro- 
gram is relatively easy, as long as the 
operating system you’re working with 
gives you the ability to tap into the net- 
working code or driver at a low enough 
level to sniff packets. 

Those of us who spend our days 
banging our heads against networking 
problems eventually learn how to use
these tools, sort of in the way that early
humans learned to cook flesh. Let’s
just say that though the results may 
have been edible, they were not win- 
ning any Michelin stars. 

Using a packet-capture tool is, to
a networking person, somewhat like
using a thermometer is to a parent. It
is likely that if you ever felt sick when
you were a child at least one of your
parents would take your temperature.
If they took you to the doctor, the doc-
tor would also take your temperature.
I once had my temperature taken for a
broken ankle—crazy, yes, but that doc- 
tor gave the best prescriptions, so I just 
smiled blithely and let him have his 
fun. That aside, taking a child’s tem- 
perature is the first thing on a parent’s 
checklist for the question “Is my child 
sick?” What on earth does this have to 
do with capturing packets? 

By far the best tool for determining 
what is wrong with programs that use 
a network, or even the network itself, is 
the tcpdump tool. Why is that? Surely
in the now 40-plus years since pack- 
ets were first transmitted across the 
original ARPANET we have developed 
some better tools. The fact is we have 
not. When something in the network 
breaks, you want to be able to see the 
messages at as many layers as possible. 

The other key component in debug- 
ging network problems is understand- 
ing the timing of what happens, which 
a good packet-capture program also 
records. Networks are perhaps the 
most nondeterministic components 
of any complex computing system. 
Finding out who did what to whom 
and when (another question parents 
often ask, usually after a fight among 
siblings) is extremely important. 

All network protocols, and the pro- 
grams that use them, have some sort 
of ordering that is important to their 
functioning. Did a message go miss- 
ing? Did two or more messages arrive 
out of order at the destination? All of 
these questions can potentially be 
answered by using a packet sniffer to 
record network traffic, but only if you 
use it! 
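
As a minimal sketch of the kind of question a capture lets you answer, assume each captured message carries a sequence number assigned consecutively by the sender. The function name `find_gaps_and_reordering` and the list-of-integers input are hypothetical and are not the output format of any particular tool. The Python below reports which sequence numbers never arrived and which arrived after a higher-numbered message:

```python
def find_gaps_and_reordering(seqs):
    """Given sequence numbers in arrival order, report missing and late ones.

    Assumes the sender numbered its messages consecutively starting at
    min(seqs). Real protocols differ (TCP, for instance, numbers bytes
    rather than messages), so treat this as an illustration only.
    """
    if not seqs:
        return [], []

    seen = set(seqs)
    missing = [n for n in range(min(seqs), max(seqs) + 1) if n not in seen]

    late = []
    highest = seqs[0]
    for n in seqs[1:]:
        if n < highest:
            late.append(n)   # arrived after a higher-numbered message
        else:
            highest = n
    return missing, late


arrivals = [1, 2, 4, 3, 7, 6]
missing, late = find_gaps_and_reordering(arrivals)
print("missing:", missing)
print("out of order:", late)
```

The check only works on traffic that was actually recorded, which is the point of the paragraph above.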

It’s also important to record the net- 
work traffic as soon as you see the prob- 
lem. Because of their nondeterministic 
nature, networks give rise to the worst 
types of timing bugs. Perhaps the bug
happens only every so many hours,
because of a rollover in a large counter;
you really want to start recording the 
network traffic before the bug occurs, 
not after, because it may be many hours 
until the condition comes up again. 

So, here are some very basic recom- 
mendations on using a packet sniffer 
in debugging a network problem. First, 
get permission (yes, it really is KV giv- 
ing you this advice). People get cranky 
if you record their network traffic, such 
as instant messages, email, and bank-
ing transactions, and then post it to a
mailing list. Just because some person 
in IT was dumb enough to give you root 
or admin rights on your desktop does 
not mean you should just record every- 
thing and send it off. 

Next, record only as much informa- 
tion as you need to debug the problem. 
If you’re new at this you’ll probably 
have the program suck up every packet 
so you don’t miss anything, but that’s 
problematic for two reasons: the first 
is the previously mentioned privacy is-
sue; and the second is that if you record
too much data, finding the bug will be 
like finding a needle in a haystack—on- 
ly you’ve never seen a haystack that big. 
Recording an hour of Ethernet traffic 
on your LAN can capture a few hundred 
million packets. No matter how good a 
tool you have, it’s going to do a much 
better job at finding a bug if you narrow 
down the search. 

If you do record a lot of data, don’t 
try to share it all as one huge chunk.
See how these points follow each oth-
er? Most packet-capture programs
have options to say, “Once the cap- 
ture file is full, close it and start a new 
one.” Limiting files to one megabyte is 
a nice start.

Finally, do not record your data on a
network file system. There is no better 
way to ruin a whole set of packet-cap- 
ture files than by having them capture 
themselves. 
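
The recommendations above can be sketched as a small Python helper that assembles a tcpdump command line. The helper name `build_tcpdump_command`, the interface `eth0`, the address `192.0.2.10`, and the output path are all assumptions made for illustration. The flags themselves are standard tcpdump options: `-i` selects the interface, `-w` writes raw packets to a file, `-C` starts a new file after the given number of millions of bytes, `-s` limits how much of each packet is kept, and `-n` disables name resolution.

```python
import shlex


def build_tcpdump_command(interface, out_path, capture_filter,
                          file_size_mb=1, snaplen=None):
    """Return an argv list for a narrowly scoped, size-limited capture."""
    argv = [
        "tcpdump",
        "-i", interface,           # capture on one interface only
        "-n",                      # do not resolve addresses to names
        "-C", str(file_size_mb),   # new file every N million bytes
        "-w", out_path,            # write raw packets to a local file
    ]
    if snaplen is not None:
        argv += ["-s", str(snaplen)]   # keep only the first snaplen bytes
    argv.append(capture_filter)        # record only the traffic of interest
    return argv


cmd = build_tcpdump_command(
    interface="eth0",                                  # hypothetical interface
    out_path="/var/tmp/problem.pcap",                  # local disk, not a network file system
    capture_filter="host 192.0.2.10 and tcp port 80",  # hypothetical filter
)
print(shlex.join(cmd))
```

The sketch only builds and prints the command. Actually running it needs capture privileges and, as noted above, permission.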

So there you have it: a brief intro- 
duction to capturing data so you can 
debug a networking problem. Perhaps 
now you can get yelled at on a mailing 
list for something more egregious than 
not taking your network’s temperature 
before calling the doctor. 

KV

Dedication
I would like to dedicate this column 
to my first editor, Mrs. B. Neville-Neil,
who passed away after a sudden illness
on December 9th, 2009; she was 65
years old. 

My mother took language, both writ- 
ten and spoken, very seriously. The last 
thing I wanted to hear upon showing 
her an essay I was writing for school 
was, “Bring me the red pen.” In those 
days I did not have a computer; all my 
assignments were written longhand 
or on a typewriter and so the red pen 
meant a total rewrite. She was a tough 
editor, but it was impossible to question 
the quality of her work or the passion 
that she brought to the writing process. 
All of the things Strunk and White have 
taught others throughout the years my 
mother taught me, on her own, with the 
benefit of only a high school education 
and a voracious appetite for reading. 

It is, in large part, due to my moth- 
er’s influence that I am a writer today. 
It is also due to her influence that I re-
view articles, books, and code on pa-
per, using a red pen. Her edits and her
unswerving belief that I could always 
improve are, already, keenly missed. 

—George Vernon Neville-Neil III

An Interview with
Michael Rabin


Michael O. Rabin, co-recipient of the 1976 ACM A.M. Turing Award, 
discusses his innovative algorithmic work with Dennis Shasha. 


I FIRST ENCOUNTERED Michael
Rabin in 1980 when I was a
first-year Ph.D. student at Har-
vard University. It was four years
after Rabin and Dana Scott 
won the ACM A.M. Turing award for 
their foundational work on determin- 
istic and nondeterministic finite state 
automata. By 1980, however, Rabin’s 
interests were all about randomized 
algorithms. His algorithms course was 
a challenge to all of us. Sometimes he 
presented a result he had worked out 
only two weeks earlier. When I had dif-
ficulty understanding, I would make
puzzles for myself. As a consequence,
when I published my first puzzle book
The Puzzling Adventures of Dr. Ecco, I
dedicated the book to Michael as one
of my three most influential teachers.


At Harvard, I enjoyed every encounter 
with Michael—in seminars and at par-
ties. He was a great raconteur and a
great joker. In 1994, journalist Cathy 
Lazere and I embarked on the writ- 
ing of Out of Their Minds: The Lives and 
Discoveries of 15 Great Computer Scien- 
tists. The goal was to interview great 
living seminal thinkers of our field: 
Knuth, Rabin, Dijkstra, among others. 
Each thinker was associated with an 
epithet: Michael’s was “the possibili- 
ties of chance.” 

—Dennis Shasha, 
Courant Institute, New York University, 


Shasha: I’m going to try to get to a
mixture of personal and technical
recollections. Let’s start in Israel. You
finished high school, and what were
you thinking about doing after high 
school? 


Rabin: I’ve been intensely inter- 
ested in mathematics ever since I was 
11 years old, when I was kicked out of 
class and told to stand in the hall. I en-
countered two ninth-graders who were
working on geometry problems and
asked them what they were doing. They
said they had to prove such-and-such a 
statement, and then condescendingly 
said to me, “You think you can do it?” 
Even though I never studied geometry, 
I was able to solve that problem. The 
fact that you can by pure thought prove
statements about the real world of
lines, circles, and distances impressed 
me so deeply that I decided I wanted to 
study mathematics. 

My father sent me to the best high 
school in Haifa, where I grew up, and 
in 10th grade I had the good luck to 
encounter a real mathematician, El- 
isha Netanyahu (the uncle of Benja- 
min Netanyahu) who was at the time a 
high school teacher. Later he became 
a professor at the Haifa Institute of 
Technology. He conducted once a 
week a so-called “mathematical cir- 
cle,” where he taught a selected group 
of students number theory, combina-
torics, and advanced algebra. Netan-
yahu lent me advanced mathematics
books. Starting at age 15, I read Hardy
and Wright’s book An Introduction to
the Theory of Numbers and the first volume
in German of Landau’s Zahlentheorie.
I studied a wonderful book by G.H.
Hardy called Pure Mathematics, which
was mainly analysis, two volumes of
Knopp’s Functions of a Complex Vari-
able, A. Speiser’s book Gruppentheorie,
and so on.

I finished high school at age 16½
because the war of independence 
broke out in Israel, and everybody in 
my class was drafted into the army. 
I continued to study mathematics 
on my own while in the army. Then I 
got in touch with Abraham Fraenkel, 
who was a professor of mathematics 
in Jerusalem and whose book on set 
theory I had studied. That was maybe
in September of 1949. Fraenkel met
and tested me on group theory, num-
ber theory, set theory, and decided
that it would be quite worthwhile if I 
came to the university. He got in touch 
with army authorities, and I was dis- 
charged from the army to go and study 
at the university. 

The system was that you went di- 
rectly to a master’s degree. I studied 
a lot of mathematics: set theory, alge- 
bra, functions of a complex variable, 
linear spaces, differential equations, 
probability as well as physics and ap- 
plied mathematics. I was mainly inter- 
ested in algebra. So in 1952 I wrote a 
master’s thesis on the algebra of com- 
mutative rings, and solved an open 
problem due to Emmy Noether, giving 
a necessary and sufficient condition 
on a commutative ring for having the 
property that every ideal is a finite in- 
tersection of primary ideals. That was 
one of my first papers. It appeared in 
the Comptes Rendus (the proceedings) 
of the French Academy of Sciences. 


Princeton Days and String 
Automata Theory 


I went to Princeton University to study
with Alonzo Church. At the time, only
13 Ph.D. students were admitted every 
year to the Mathematics Department. 
I found out the committee that was 
considering me assigned somebody 
to look at the master’s thesis paper, 
and this is what led to my admission. 
My Ph.D. thesis was on computation-
al (recursive) unsolvability of group
theoretical problems, thus combining
my interests in algebra and comput-
ability.

In 1957, IBM decided to go into re-
search in a big way. They sent people
to recruit promising students from
top universities and top mathematics
departments. Dana Scott and I were
invited to spend the summer at IBM
Research. Watson Labs did not exist at
that time, and research was located at
the Lamb Estate in Westchester Coun-
ty. This had previously been an insane
asylum where, rumor had it, rich fami- 
lies would forcibly confine trouble- 
some members. 

At the Lamb Estate, Dana and I
wrote the paper “Finite Automata and
Their Decision Problems.” Automata
theory had started with a study by
Walter Pitts and Warren McCulloch
of what would now be called neural
networks. They assumed that those
neural networks essentially embod-
ied a finite number of states, and
talked about what is recognizable by
such finite state neural networks, but
they didn’t have a complete charac- 
terization. Later, in 1956, S.C. Kleene 
invented regular languages: sets of 
strings that are obtained from finite 
sets by certain simple operations. 
Kleene showed that finite neural nets 
compute or recognize exactly the reg- 
ular languages. 

We decided to completely abstract 
away from neural nets and consider a 
finite-state machine that gets symbol 
inputs and undergoes state transi- 
tions. If upon being fed a string of 
symbols it passes from an initial state
to one of a number of accepting states,
then the string is accepted by the
finite-state machine. The finite-state
machine had no output except for say- 
ing yes/no at the end. 

Dana and I asked ourselves in what 
ways could we expand the finite-state 
machine model. One of our extensions 
was to consider nondeterministic ma- 
chines. We really did not have any deep 
philosophical reason for considering 
nondeterminism, even though as we 
now know nondeterminism is at the 
center of the P = NP question, a prob- 
lem of immense practical and theo- 
retical importance. For us, it was just 
one of the variants. We stipulated that 
the automaton when in state S and 
upon input symbol sigma can go into
any one of a number of states S’, S”, ...,
comprising a certain subset of the set
of states. The nondeterministic finite- 
state machine accepts a string if there 
is a possible computation, a possible 
sequence of transitions, leading to an 
accepting state. Then we proved that 
finite-state nondeterministic autom- 
ata are exactly equal in power to de- 
terministic automaton computations. 
[On a practical level, this means that 
the wildcard searches of say grep, perl, 
or python can be expressed as nonde- 
terministic finite state automata and 
then can be translated into determin-
istic finite state automata.]
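
The equivalence described above is usually shown with the subset construction, in which each state of the deterministic machine is a set of states of the nondeterministic one. The Python below is a minimal sketch of that standard textbook construction, not a transcription of the original paper's proof. The function names and the example automaton (strings over {a, b} whose second-to-last symbol is a) are assumptions chosen for illustration.

```python
from itertools import chain


def nfa_to_dfa(alphabet, delta, start, accepting):
    """Subset construction: build an equivalent DFA from an NFA.

    delta maps (state, symbol) -> set of possible next states.
    Each DFA state is a frozenset of NFA states.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    todo = [start_set]
    seen = {start_set}
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            nxt = frozenset(chain.from_iterable(
                delta.get((q, symbol), ()) for q in current))
            dfa_delta[(current, symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # A subset accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return start_set, dfa_delta, dfa_accepting


def dfa_accepts(start_set, dfa_delta, dfa_accepting, string):
    """Run the deterministic machine and answer yes/no at the end."""
    state = start_set
    for symbol in string:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting


# Hypothetical example NFA: in q0 it may "guess" that the current 'a'
# is the second-to-last symbol and move to q1.
delta = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "a"): {"q2"},
    ("q1", "b"): {"q2"},
}
dfa = nfa_to_dfa("ab", delta, "q0", {"q2"})
print(dfa_accepts(*dfa, "bbab"))
print(dfa_accepts(*dfa, "bbba"))
```

The deterministic machine can need up to 2^n states for an n-state nondeterministic one, which is the price of removing the guessing.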

The corresponding question wheth- 
er nondeterministic polynomial time 
Turing machine computations are 
equal in power to deterministic poly- 
nomial time Turing machine compu- 
tations is the famous P = NP problem. 

Employing nondeterministic au-
tomata, we were able to re-prove
Kleene’s result that finite state ma-
chines exactly accept regular lan-
guages. The proof became very easy.
We also introduced and studied other
extensions of finite automata such as
two-way automata, multi-tape and lin-
early bounded automata. The latter
construct appeared in our research
report but did not find its way into the
published paper.


Origins of Complexity Theory 

The next summer I again went to the 
Lamb Estate. At that time there was
a methodologically misled, widely
held view that computing, and what
later became computer science, was a
sub-field of information theory in the
Shannon sense. This was really ill-con-
ceived because Shannon was dealing
with the information content of mes- 
sages. If you perform a lengthy calcu- 
lation on a small input then the infor- 
mation content in the Shannon sense 
of the outcome is still small. You can 
take a 100-bit number and raise it to 
the power 100, so you get a 10,000-bit 
number. But the information content 
of those 10,000 bits is no larger than 
that of the original 100 bits. 
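
The arithmetic in this example is easy to check. The short Python sketch below picks a random 100-bit number (the variable names are illustrative) and raises it to the power 100. The result has close to 10,000 bits, yet it is completely determined by the 100-bit input and the rule "raise to the power 100".

```python
import secrets

# Force the top bit so that x has exactly 100 bits.
x = secrets.randbits(100) | (1 << 99)
y = x ** 100   # a lengthy calculation on a small input

print(x.bit_length())
print(y.bit_length())
```

Since 2^99 <= x < 2^100, the power satisfies 2^9900 <= y < 2^10000, so the second number printed always lies between 9,901 and 10,000.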

John McCarthy posed to me a puz-
zle about spies, guards, and passwords.
Spies must present, upon returning
from enemy territory, some kind of 
secret password to avoid being shot 
by their own border guards. But the 
guards cannot be trusted to keep a se-
cret. So, if you give them the password,
the enemy may learn the password and 
safely send over his own spies. Here is 
a solution. You randomly create, say, a
100-digit number x and square it, and
give the guards the middle 100 digits
of x². John von Neumann had sug-
gested the middle square function for
generating pseudo-random numbers.
You give the number x to the spy. Upon
being challenged, the spy presents x.
The guard computes x² and compares
the middle square to the value he has. 
Every password x is used only once. 
The whole point is that presumably it 
is easy to calculate the middle square, 
but it is difficult, given the middle 
square, to find one of the numbers 
having that value as a middle square. 
So even if the guards divulge the mid- 
dle square, nobody else can figure out 
the number the spy knows. 
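
A minimal Python sketch of the protocol just described follows. The function names are hypothetical, and padding the square to exactly 200 digits before taking the middle is an assumption about how "the middle 100 digits" is defined. The sketch illustrates the mechanics only. It makes no claim that the function is hard to invert, which is exactly the question left open in the text.

```python
import secrets

DIGITS = 100   # length of the password, as in the example above


def middle_square(x, digits=DIGITS):
    """Return the middle `digits` digits of x squared, as a string."""
    square = str(x * x).zfill(2 * digits)   # pad to a fixed 2*digits width
    start = digits // 2
    return square[start:start + digits]


def new_password(digits=DIGITS):
    """Create a random password x and the value handed to the guards."""
    x = secrets.randbelow(10 ** digits)
    return x, middle_square(x, digits)


def guard_accepts(presented, stored_middle, digits=DIGITS):
    """The guard recomputes the middle square and compares."""
    return middle_square(presented, digits) == stored_middle


x, stored = new_password()
print(len(stored))
print(guard_accepts(x, stored))
```

With four digits instead of 100, the password 1234 squares to 01522756 and the guards would hold 5227.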


But how do you even define that dif-
ficulty of computing? And even more
so, how do you prove it? I then set 
myself to study that problem. I wrote 
an article called “Degree of Difficulty 
of Computing a Function and Hierar- 
chy of Recursive Sets.” In that article, 
I wasn’t able to solve the problem of 
showing that the von Neumann func- 
tion is difficult to invert. This is really 
a special case of the P = NP problem. 
It hasn’t been settled to this day. But 
I was able to show that for every com- 
putable function there exists another 
computable function that is more 
difficult to compute than the first 
one, regardless of the algorithm or 
programming language one chooses. 
It’s similar to the minimum energy 
required for performing a physical
task. If this phone is on the floor and
I have to raise it, there is a minimum 
amount of work. I can do it by pulling 
it up, by putting a small amount of ex- 
plosive, blowing it up here, but there 
is a certain inherent amount of work. 
This is what I was studying for compu- 
tations. 

I think this paper, no less than 
the Rabin/Scott paper, was a reason 
for my Turing Award. The ACM an-
nouncement of the Turing Award for
Dana and for me mentioned the work
on finite automata and other work we
did and also suggested that I was the 
first person to study what is now called 
complexity of computations. 


Randomized Algorithms: 
A New Departure 


I went back to Jerusalem. I divided my
research between working on logic,
mainly model theory, and working on 
the foundations of what is now com- 
puter science. I was an associate pro- 
fessor and the head of the Institute 
of Mathematics at 29 years old and 
a full professor by 33, but that was 
completely on the merit of my work in 
logic and in algebra. There was abso- 
lutely no appreciation of the work on 
the issues of computing. Mathemati- 
cians did not recognize the emerging 
new field. 

In 1960, I was invited by E.F. Moore 
to work at Bell Labs, where I intro-
duced the construct of probabilistic
automata. These are automata that
employ coin tosses in order to de- 
cide which state transitions to take. I 
showed examples of regular languag- 
es that required a very large number 
of states, but for which you get an ex- 
ponential reduction of the number of 
states if you go over to probabilistic 
automata.

Shasha: And with some kind of er-
ror bound?

Rabin: Yes, yes, that’s right. In
other words, you get the answer, but
depending upon how many times you
run the probabilistic automaton, you 
have a very small probability of error. 
That paper eventually got published 
in 1963 in Information and Control. 

In 1975, I finished my tenure as 
Rector (academic head) of the Hebrew 
University of Jerusalem and came to 
MIT as a visiting professor. Gary Mill-
er was there and had his polynomial
time test for primality based on the
extended Riemann hypothesis. [Given
an integer n, a test for primality deter-
mines whether n is prime.] That test
was deterministic, but it depended
on an unproven assumption. With the 
idea of using probability and allowing 
the possibility of error, I took his test 
and made it into what’s now called a 
randomized algorithm, which today 
is the most efficient test for primality. 
I published first, but found out that 
Robert Solovay and Volker Strassen 
were somewhat ahead with a different 
test. My test is about eight times faster 
than theirs and is what is now univer- 
sally being used. In the paper I also 
introduced the distinction between 
what are now called Monte-Carlo and 
Las-Vegas algorithms. 
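
The randomized test described above is commonly taught as the Miller-Rabin test. The Python below is a textbook-style sketch of it, not code from the original paper. The function name, the default of 20 rounds, and the preliminary division by a few small primes are choices made for this example. Each random base either proves that n is composite (it is a witness) or leaves the question open, and for a composite n at least three quarters of the possible bases are witnesses.

```python
import random


def is_probable_prime(n, rounds=20):
    """Randomized primality test in the Miller-Rabin style.

    Returns False only if n is certainly composite. Returns True if no
    witness of compositeness was found in `rounds` random trials, so a
    composite n slips through with probability at most 4**(-rounds).
    """
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p

    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1

    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness: n is composite
    return True


print(is_probable_prime(2**61 - 1))   # a Mersenne prime
print(is_probable_prime(252601))      # 41 * 61 * 101
```

An answer of "composite" is always correct and only "probably prime" can be wrong, which makes this a Monte-Carlo algorithm in the sense of the distinction mentioned above.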

In early 1976 I was invited by Joe 
Traub for a meeting at CMU and gave a
talk presenting the primality test. After 
I gave that lecture, people were stand- 
ing around me, and saying, “This is re- 
ally very beautiful, but the new idea of 
doing something with a probability of 
error, however exponentially small, is 
very specialized. This business of wit-
nesses for compositeness you have in-
troduced is only useful for the primal-
ity test. It will never really be a widely
used method.” Only Joe Traub said, 
“No, no, this is revolutionary, and it’s 
going to become very important.” 


From Trick to
Fundamental Technique

From then on, I set myself the goal of 
finding more and more applications 
of randomization in mathematics and 
computer science. For example, in 
1977 in my MIT class, I presented an 
algorithm for efficiently expressing 
an integer as the sum of four squares. 
Lagrange had proved in 1770 that ev-
ery integer can be so expressed, but
there was no efficient algorithm for
doing it. In 1977 I found an efficient
randomized algorithm for doing that
computation. That algorithm later ap-
peared in a joint paper with Jeff Shallit
in 1986 together with additional appli- 
cations of randomization to number 
theoretical algorithms. Later, I turned 
my attention to distributed algorithms 
and found an approach using a ran- 
dom shared bit for solving Byzantine 
Agreement far more efficiently than 
previous approaches. Still later I ap-
plied randomization to asynchronous
fault-tolerant parallel computations in
collaboration with Jonatan Aumann,
Zvi Kedem, and Krishna Palem.


Right now, if you look at STOC 
and FOCS [the major conferences in 
theoretical computer science] maybe
a third to half of the papers are built
around randomized algorithms. And
of course you’ve got the wonderful
book Randomized Algorithms by Rajeev
Motwani and Prabhakar Raghavan.


Shasha: Let me back up and talk a 
little bit about the general uses of ran- 
domness in computer science. There 
seem to be at least three streams of 
use of randomness—yours, the use 
in communication (for example, ex-
ponential back off in the Ethernet
protocol), and the use in genetic algo-
rithms where random mutations and
random recombination sometimes 
lead to good solutions of combinato- 
rial problems. Do you see any unified 
theme among all those three? 

Rabin: I would say the following: 
The use in the Ethernet protocol is in 
some sense like the use in Byzantine 
agreement. In Byzantine agreement,
the parties want to reach agreement
against improper or malicious oppo-
nents. In the Ethernet protocol, the
participants want to avoid clashes,
conflicts. That’s somewhat similar. I 
don’t know enough about genetic al- 
gorithms, but they are of the same na- 
ture as the general randomized algo- 
rithms. I must admit that after many 
years of work in this area, the efficacy 
of randomness for so many algorith- 
mic problems is absolutely mysterious 
to me. It is efficient, it works; but why 
and how is absolutely mysterious. 

It is also mysterious in another way 
because we cannot really prove that 
any process, even let’s say radioac- 
tive decay, is truly random. Einstein 
rejected the basic tenets of Quantum
Theory and said that God does not
play dice with the universe. Random-
ized algorithms, in their pure form,
must use a physical source of random- 
ness. So it is cooperation between us 
as computer scientists and nature as 
a source of randomness. This is re- 
ally quite unique and touches on deep 
questions in physics and philosophy. 


Tree Automata 

Let me return to a chapter of my work 
that I skipped before. After the work 
on finite automata by Dana Scott and 
me, two mathematicians, Richard Bü-
chi and Calvin Elgot, discovered how
the theory of finite automata could
be used to solve decision problems in
mathematical logic. They showed that
the so-called Presburger arithmetic
decision problem could be solved by
use of finite automata. Then Büchi
went on to generalize finite automata 
on finite sequences to finite automata 
on infinite sequences, a very brilliant 
piece of work that he presented at 
the Congress on Logic, Philosophy, 
and Methodology in Science in 1960. 
In that paper, he showed that the so- 
called monadic second-order theory 
of one successor function is decid- 
able. Let me explain very briefly what 
that means. 

We have the integers, 0, 1, 2, 3. The 
successor function is defined as S(x) = 
x+1. The classical decision problems 
pertain to problems formulated with- 
in a predicate logic—the logic of rela- 
tions and functions with the quantifi- 
ers of “exists” x and “for all” x, and the 
logical connectives. Monadic second- 
order theory means that you quantify
over sets. Büchi demonstrated that
the monadic second-order theory of
one successor function—where you 
are allowed to quantify over arbitrary 
subsets of integers—is decidable. His 
was the first result of that nature. 

Büchi posed an open problem:
is the monadic second-order theory
decidable if you have two successor
functions? For example, one successor
function being 2x (double), and the 
other one being 2x+1. I don’t know if 
he realized how powerful the monad- 
ic theory of two successor functions 
would turn out to be. I realized that 
this is an incredibly powerful theory. 
If you find a decision procedure for 
that theory, then many other logical 
theories can be demonstrated to be 
decidable. 


I set myself to work on this prob-
lem. My formulation was a generaliza-
tion from infinite strings to infinite bi-
nary trees. You consider a tree where
you have a root, and the root has two
children, a child on the left and a right
child, and each of these has again two 
children, a left child and a right child. 
The tree branches out ad infinitum, 
forming an infinite binary tree. Con- 
sider that tree with the two successor 
functions, left child, right child, and 
study the logical theory of the tree with 
these two functions and quantifica- 
tion over arbitrary subsets of nodes of 
the tree. 
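
To make the link between the two successor functions and the binary tree concrete, here is a small Python sketch. It uses the functions 2x and 2x+1 from the example above. Numbering the root as 1 and describing a node by a string of L/R moves are assumptions made for this illustration.

```python
def left(x):
    return 2 * x        # first successor function


def right(x):
    return 2 * x + 1    # second successor function


def node(path, root=1):
    """Follow a string of 'L'/'R' moves from the root and return the node."""
    x = root
    for move in path:
        x = left(x) if move == "L" else right(x)
    return x


# The first three levels of the infinite tree.
for path in ["", "L", "R", "LL", "LR", "RL", "RR"]:
    print(repr(path), node(path))
```

With this numbering every positive integer names exactly one node, so a set of nodes is simply a set of integers, and quantifying over subsets of nodes is what the monadic second-order theory allows.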

In 1966, I came as a visitor to IBM
Research at Yorktown Heights, and
one of my goals was to find an appropriate
theory of automata on infinite binary 
trees and prove the decidability of the 
same problems that Dana Scott and I 
had shown to be decidable for finite 
automata on finite strings. I created 
the appropriate theory of automata 
on these infinite trees and showed 
that it was decidable. I consider this 
to be the most difficult research I 
have ever done. 

A remarkable feature of that origi- 
nal proof is that even though we are 
dealing with finite automata, and with 
trees that are infinite but countable, 
the proof and all subsequent variants 
employ transfinite induction up to the 
first uncountable ordinal. Thus the 
proof is a strange marriage between 
the finite and countable with the un- 
countable. 

This theory led to decision algo- 
rithms for many logical theories. That 
included decidability of nonstandard
logics like modal logics and the tem-
poral logics that are a tool for program
verification, especially in the work of 
Amir Pnueli, Moshe Vardi, Orna Kup- 
ferman, and many others. 


Keeping Secrets 
The next significant application of 
my work on randomization was to 
cryptography. Ueli Maurer suggested a
model of a cryptographic system that is
based on what he called the bounded 
storage assumption. Namely, you have 
a very intense public source of ran- 
dom bits that everybody can observe, 
say beamed down from a satellite. 
The sender and receiver have a short 
common key that they establish say by 
meeting, and they use that key in order 
to select the same random bits out of 
the public source of randomness. Out 
of those bits, they construct so-called 
one-time pads. If you assume that the 
intensity of the random source is so 
large that no adversary can store more 
than, let us say, two-thirds of its bits, 
then the one-time pad is really com- 
pletely random to the adversary and 
can be used for provably unbreakable 
encryption. 

Maurer initially proved the unbreak-
ability result under the assumption
that the adversary stores original bits
from the random source. However,
there remained an open question:
suppose your adversary could do some 
operations on the original bits and 
then store fewer bits than the num- 
ber of source bits. Jonatan Aumann, 
Yan Zong Ding, and I showed that 
even if the adversary is computation- 
ally unbounded, one can still obtain 
unbreakable codes provided the ad-
versary cannot store all his computed
bits. This work created quite a stir and 
was widely discussed, even in the pop- 
ular press. 

Nowadays however, the bounded 
storage assumption may not be com- 
pelling because the capacity of stor- 
age has increased so dramatically. 
So I posited a new Limited Access 
Model. Suppose each of 10,000 par-
ticipants independently stores physi-
cally random pages. The sender and
receiver use a common small ran-
dom key to randomly select the same
30 page server nodes and from each
node download the same randomly
selected page. Sender and receiver
now XOR those 30 pages to create a
one-time pad they employ to encrypt
messages. Assume that an adversary
cannot listen to or subvert more than
say 2,000 of the 10,000 page server 
nodes. Consequently, his probability 
of having obtained any particular ran- 
dom page is no more than 1/5, and, 
since the pages are XORed in groups 
of 30, his probability of having all of 
those pages is 1/5 to the power of 30. 
If the adversary is missing even one 
of those pages, then the one-time pad 
is completely random with respect to 
him, and consequently, if used to en- 
crypt by XORing with the message, the 
encrypted message is also completely 
random for that adversary. 
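
The construction and the probability bound can be made concrete with a short C sketch. It is only an illustration under stated assumptions: the 16-byte `PAGE_SIZE`, the flat page layout, and the function names `xor_pages`, `xor_with_pad`, and `capture_probability` are invented for this example and do not come from Rabin's work. Selecting the nodes and pages from the shared key is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical page size in bytes; the interview does not specify one. */
#define PAGE_SIZE 16

/* XOR `count` pages, stored back to back in `pages`, into `pad`. */
void xor_pages(const unsigned char *pages, size_t count,
               unsigned char pad[PAGE_SIZE])
{
    memset(pad, 0, PAGE_SIZE);
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < PAGE_SIZE; j++)
            pad[j] ^= pages[i * PAGE_SIZE + j];
}

/* Encrypt or decrypt in place; XOR with the pad is its own inverse. */
void xor_with_pad(unsigned char *msg, size_t len,
                  const unsigned char pad[PAGE_SIZE])
{
    assert(len <= PAGE_SIZE);
    for (size_t j = 0; j < len; j++)
        msg[j] ^= pad[j];
}

/* Bound on the chance that an adversary who observes at most `watched`
 * of `total` nodes holds all `k` selected pages: (watched/total)^k. */
double capture_probability(unsigned watched, unsigned total, unsigned k)
{
    assert(total > 0 && watched <= total);
    double p = (double)watched / (double)total;
    double result = 1.0;
    for (unsigned i = 0; i < k; i++)
        result *= p;
    return result;
}
```

With the figures quoted above, `capture_probability(2000, 10000, 30)` evaluates (1/5)^30, which is roughly 1.07e-21.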


Privacy for Pirates and Bidders 

In the late 1990s, [Dennis Shasha]
came to me and suggested that we work
together on devising a system for pre- 
venting piracy, to begin with, of soft- 
ware, but later on also music, videos, 
and so on—any kind of digitized intel- 
lectual property. We started by rein- 
venting variants of existing methods, 
such as the use of safe co-processors 
and other methods that were actually 
current at the time and which, by the 
way, have all either been defeated or 
impose excessive constraints on use. 
We then invented a new solution. Our 
design protects privacy of legitimate 
purchasers and even of pirates while 
at the same time preventing piracy of 
protected digitized content. For ex- 
ample, an engineering firm wanting 
to win a contest to build a bridge can 
purchase software for bridge design 
without identifying itself. Thus com- 
petitors cannot find out that the firm
is competing for the project. The pur-
chaser of digital content obtains a tag 
permitting use of the content. The tag 
is not tied to a machine identifier so 
that the tag and use of the content can 
be moved from machine to machine. 
Coupled with robust content-identify- 
ing software, use of protected content 
absent a corresponding tag is stopped. 
The deployment of this robust solu-
tion to the piracy problem requires
the cooperation of system vendors.

More recently, I have become in- 
terested in protecting the privacy and 
secrecy of auctions. Working with Stu-
art Shieber, David Parkes, and Chris
Thorpe, we have a methodology that 
employs cryptography to ensure hon- 
est privacy-preserving auctions. Our 
protocols enable an auctioneer to con- 
duct the auction in a way that is clearly 
honest, prevents various subversions 
of the auction process itself, and later 
on, when the auction is completed 
and the auctioneer has determined 
the winner or winners and how much 
they pay and how much they get of 
whatever is being auctioned in multi- 
item auctions, the auctioneer can 
publish a privacy-preserving proof for 
the correctness of the result. In the ini- 
tial papers, we used the tool of homo- 
morphic encryption. Later on, I had 
an idea that I then eventually imple-
mented with Chris Thorpe and Rocco
Servedio of an entirely new approach
to zero knowledge proofs, which is
computationally very efficient, does 
not use heavy-handed and compu- 
tationally expensive encryption, and 
achieves everything very efficiently by 
use of just computationally efficient 
hash functions. 


Teachers, Teaching, and Research 
In the life of every creative scientist, 
you will find outstanding teachers 
that have influenced him or her and 
directly or indirectly played a role in 
their success. My first mentor was a 
very eminent logician and set theo- 
rist, Abraham Halevi Fraenkel. I wrote 
a master’s thesis under the direction
of the wonderful algebraist Jacob Lev-
itski. He was a student of maybe the
greatest woman mathematician ever, 
Emmy Noether, the creator of modern 
abstract algebra as we know it. In Je-
rusalem, I learned applied mathemat-
ics from Menachem Schiffer, a great
mathematician and a great teacher.

Shasha: Let’s talk a little bit about 
the relationship that you’ve men- 
tioned to me often between teaching 
and research, because you’ve won 
many awards for teaching as well as 
for science. How do you think aca- 
demics should view their teaching re- 
sponsibilities? 

Rabin: This is really a very important
question. There is this misconception 
that there is a conflict and maybe even 
a contradiction between great teach- 
ing and being able to do great science. 
I think this is completely incorrect, 
and that wonderful teaching, such as 
Schiffer’s teaching, flows from a deep 
understanding of the subject matter. 
This is what enables the person also 
to correctly select what to teach. Af-
ter all, the world of knowledge, even
in specialized subjects, is almost in-
finite. So one great contribution is to
select the topics, and the other great
contribution is to really understand
the essence, the main motifs, of each
particular topic that the person pres-
ents, and to show it to the class in a
way that the class gets these essential
ideas. Great teaching and great sci-
ence really flow together and are not
mutually contradictory or exclusive of
each other.


The Future

Shasha: You have contributed to so
many areas of computer science. Talk
about one or many, and where do you
think it is going?

Rabin: Computer science research
has undergone a number of evolutions
during my research career, and be-
cause of the quick pace, one can even
say revolutions. To begin with, there
was Alan Turing’s model of a comput-
ing machine, leading to the stored-
program computer. Next there was a
great emphasis on the study of vari-
ous mathematical machines. Finite
automata are models for sequential
circuits. Nondeterministic and deter-
ministic so-called pushdown autom-
ata play a pivotal role in the syntactic
compilation of programming languag-
es. For about 20 years or so, there was a
great emphasis in research on autom-
ata, programming languages, and for-
mal languages, also in connection of
course with linguistics. There was also
a considerable emphasis on efficient
algorithms of various kinds. The book
by Alfred Aho, John Hopcroft, and Jeff
Ullman, and Donald Knuth’s classical
books are examples of this strong in- 
terest in algorithms. The study of al- 
gorithms will always remain centrally 
important. Powerful algorithms are 
enabling tools for every computer in- 
novation and application. 

Then emphasis started to shift to- 
ward issues of networking and com- 
munication. Even the people who 
created the Internet, email, and other 
forms of connectivity perhaps didn’t 
fully realize where all that was going 
to lead. This evolution, revolution, ex- 
plosion, started to accelerate, and over 
the past 10 years we went into a phase 
where the main use of computers is 
in the world of information creation, 
sharing, and dissemination. We have
search engines, Wikipedia, Facebook,
blogs, and many other information re- 
sources, available to everyone literally 
at their fingertips. 

We are only at the beginning of this 
revolution. One can predict that there 
is going to be an all-encompassing 
worldwide network of knowledge 
where people are going to create, 
share, and use information in ways 
that never existed before, and which 
we can’t fully foresee now. These de- 
velopments are giving rise to a torrent 
of research in information organiza- 
tion and retrieval, in machine learn- 
ing, in computerized language under- 
standing, in image understanding, 
in cryptography security and privacy 
protection, in multi-agent systems, to 
name just a few fields. Computer sci- 
ence research will continue to be rich, 
dynamic, exciting, and centrally im-
portant for decades to come.


THE RATE AT which power-management features
have evolved is nothing short of amazing. Today
almost every size and class of computer system, from
the smallest sensors and handheld devices to the
“big iron” servers in data centers, offers a myriad of
features for reducing, metering, and capping power
consumption. Without these features, fan noise would 
dominate the office ambience and untethered laptops 
would remain usable for only a few short hours (and 
only then if one could handle the heat), while data- 
center power and cooling costs and capacity would 
become unmanageable. 

As much as we might think of power-management 
features as being synonymous with hardware, 
software’s role in the efficiency of the overall system 
has become undeniable. Although the notion of 
“software power efficiency” may seem justifiably 
strange (as software doesn’t directly consume power),
the salient part is really the way in
which software interacts with power- 
consuming system resources. 

Let’s begin by classifying software 
into two familiar ecosystem roles: 
resource managers (producers) and 
resource requesters (consumers). We 
will then examine how each can con- 
tribute to (or undermine) overall sys- 
tem efficiency. 

The history of power management is 
rooted in the small systems and mobile
space. By today’s standards, these sys-
tems were relatively simple, possessing
a small number of components, such
as a single-core CPU and perhaps a disk
that could be spun down. Because these
systems had few resources, utilization
in practice was fairly binary in nature,
with the system’s resources either be-
ing in use—or not. As such, the strategy
for power managing resources could
also be fairly simple, yet effective.

For example, a daemon might pe-
riodically monitor system utilization
and, after the system has appeared
sufficiently idle for some time
threshold, clock down the CPU’s
frequency and spin down the disk.
This could all be done in a way
that required little or no integration
with the subsystems otherwise respon-
sible for resource management (for
example, the scheduler and file system,
among others), because at zero utiliza-
tion, not much resource management
needed to be done.
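
The policy such a daemon applies can be sketched in a few lines of C. This is a hypothetical illustration, not code from any real power-management daemon: the `idle_monitor` structure, its threshold fields, and `idle_monitor_sample` are assumed names, and the actual frequency and disk controls are reduced to a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state tracked by a simple power-management daemon. */
struct idle_monitor {
    double   idle_below;     /* utilization (0..1) treated as idle */
    unsigned samples_needed; /* consecutive idle samples before acting */
    unsigned idle_samples;   /* consecutive idle samples seen so far */
    bool     powered_down;   /* stands in for "CPU clocked down, disk spun down" */
};

/* Feed one periodic utilization sample; returns the resulting power state. */
bool idle_monitor_sample(struct idle_monitor *m, double utilization)
{
    assert(utilization >= 0.0 && utilization <= 1.0);
    if (utilization < m->idle_below) {
        if (m->idle_samples < m->samples_needed)
            m->idle_samples++;
        if (m->idle_samples >= m->samples_needed)
            m->powered_down = true;
    } else {
        /* Any activity resets the idle streak and restores full power. */
        m->idle_samples = 0;
        m->powered_down = false;
    }
    return m->powered_down;
}
```

Because the decision depends only on a single systemwide utilization figure, nothing here needs to coordinate with the scheduler or file system, which is the property the paragraph above describes.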

By comparison, the topology of mod- 
ern systems is far more complex. As 
the “free performance lunch” of ever- 
increasing CPU clock speeds has come 
to an end, the multicore revolution is 
upon us, and as a consequence, even 
the smallest portable devices present 
multiple logical CPUs that need to be 
managed. As these systems scale larger 
(presenting more power-manageable
resources), partial utilization becomes
more common where only part of the
system is busy while the rest is idle.
Of course, CPUs present just one ex-
ample of a power-manageable system
resource: portions of physical memory
may (soon) be power manageable, with
the same being true for storage and
I/O devices. In the larger data-center
context, the system itself might be the 
power-manageable resource. 

Effective resource management on 
modern systems requires that there 
be at least some level of resource man-
ager awareness of the heterogeneity
brought on by varying resource power
states and, if possible, some exploita-
tion of it. (Actually, effective resource
management requires awareness of
resource heterogeneity in general, with
varying power states being one way in
which that resource heterogeneity can
arise.) Depending on what is being
managed, the considerations could be
spatial, temporal, or both.


Spatial Considerations 

Spatial considerations involve decid- 
ing which resources to provision in 
response to a consumer’s request in 
time. For an operating-system thread 
scheduler/dispatcher, this might de-
termine to which CPUs runnable
threads are dispatched, as well as the
overall optimal distribution pattern of
threads across the system’s physical
processor(s) to meet some policy ob-
jective (performance, power efficiency,
and so on). For the virtual memory
subsystem, the same would be true for
how physical memory is used; for a file
system/volume manager, the block al-
location strategy across disks; and in a
data center, how virtual machines are 
placed across physical systems. These 
different types of resource managers 
are shown in Figure 1.

Figure 1. A hierarchy of resource managers.

One such spatial consideration is
the current power state of available
resources. In some sense, a resource’s
power states can be said to represent
a set of trade-offs. Some states provide
a mechanism allowing the system to
trade off performance for power effi-
ciency (CPU frequency scaling is one
example), while others might offer (for 
idle resources) a trade-off of reduced 
power consumption versus increased 
recovery latency (for example, as with 
the ACPI C-states). As such, the act of 
a resource manager selecting one re-
source over another (based on power
states) is an important vehicle for mak-
ing such trade-offs that ideally should
complement the power-management 
strategy for individual resources. 

The granularity with which resourc- 
es can be power managed is another 
important spatial consideration. If 
multicore processors can be power 
managed only at the socket level, then 
there’s good motivation to consolidate 
system load on as few sockets as possi- 
ble. Consolidation drives up utilization 
across some resources, while quiesc- 
ing others. This enables the quiesced 
resources to be power managed while 
“directing” power (and performance) 
to the utilized portion of the system. 
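
The effect of consolidation on quiescence can be illustrated with a toy placement routine. The sketch below assumes identical sockets and ignores workload characteristics; `place_threads` and its parameters are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Place `threads` runnable threads on `sockets` sockets of `cores` cores each.
 * load[s] receives the number of threads placed on socket s.
 * consolidate != 0 fills one socket before using the next;
 * consolidate == 0 spreads threads round-robin across sockets.
 * Returns the number of sockets left completely idle. */
size_t place_threads(size_t threads, size_t sockets, size_t cores,
                     int consolidate, size_t load[])
{
    assert(sockets > 0 && cores > 0 && threads <= sockets * cores);
    for (size_t s = 0; s < sockets; s++)
        load[s] = 0;
    for (size_t t = 0; t < threads; t++) {
        size_t s = consolidate ? t / cores : t % sockets;
        load[s]++;
    }
    size_t idle = 0;
    for (size_t s = 0; s < sockets; s++)
        if (load[s] == 0)
            idle++;
    return idle;
}
```

For four threads on four quad-core sockets, the consolidating policy leaves three sockets idle and therefore eligible for socket-level power management, while the spreading policy leaves none.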

Other factors that may play into
individual resource selection and uti-
lization distribution decisions are the
characteristics of the workload(s) us-
ing the resources. These may dictate, for
example, how aggressively a resource 
manager can consolidate utilization 
across the system without negatively 
impacting performance (as a result of 
resource contention) or to what extent 
changing a utilized resource’s power 
state will impact the consumer’s per- 
formance. 


Temporal Considerations 

Some resource managers may also al- 
locate resources in time, as well as 
(or rather than) space. For example, a 
timer subsystem might allow clients 
to schedule some processing at some 
point (or with some interval) in the fu-
ture, or a task queue subsystem might
provide a means for asynchronous
or deferred execution. The interfaces
to such subsystems have tradition-
ally been very narrow and prescriptive,
leaving little room for temporal opti-
mization. One solution is to provide
interfaces to clients that are more de-
scriptive in nature. For example, rather
than providing a narrow interface for
precise specification of what should
happen and when:


int
schedule_timer(void (*what)(void), time_t when);


a timer interface might instead spec-
ify what needs to be done along with a
description of the constraints for when
it needs to happen:


int
schedule_timer(void (*what)(void), time_t about_when,
    time_t deferrable_by, time_t advancable_by);


Figure 2. Benefits of batch processing periodic timers.

Analogous to consolidating load
onto fewer sockets to improve spa-
tial resource quiescence, providing
some temporal latitude allows the
timer subsystem to consolidate and
batch process expirations. So rather
than waking up a CPU n times over a
given time interval to process n timers
(incurring some overhead with each
wakeup), the timer subsystem could
wake the CPU once and batch process
all the timers allowable per the (more
relaxed) constraints, thus reducing
CPU overhead time and increasing
power-managed state residency (see
Figure 2).
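
One way a timer subsystem could exploit that latitude is sketched below, assuming each timer carries the descriptive fields from the second prototype. The `timer` structure and `count_wakeups` are hypothetical; the greedy rule (wake at the earliest outstanding deadline and fire everything whose window already includes that instant) is a standard interval-covering technique, not something the article prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical timer record mirroring the descriptive interface. */
struct timer {
    long about_when;    /* preferred expiry time */
    long deferrable_by; /* may fire up to this much later */
    long advancable_by; /* may fire up to this much earlier */
};

static long earliest(const struct timer *t)
{
    return t->about_when - t->advancable_by;
}

static long latest(const struct timer *t)
{
    return t->about_when + t->deferrable_by;
}

/* qsort comparator: order timers by their latest allowed firing time. */
static int by_latest(const void *a, const void *b)
{
    long la = latest(a), lb = latest(b);
    return (la > lb) - (la < lb);
}

/* Sorts `timers` by deadline and returns how many CPU wakeups are needed
 * when each wakeup fires every timer whose window contains that instant. */
size_t count_wakeups(struct timer *timers, size_t n)
{
    size_t wakeups = 0, i = 0;
    qsort(timers, n, sizeof *timers, by_latest);
    while (i < n) {
        long wake_at = latest(&timers[i]); /* earliest outstanding deadline */
        wakeups++;
        /* Fire every following timer that is already allowed to run. */
        while (i < n && earliest(&timers[i]) <= wake_at)
            i++;
    }
    return wakeups;
}
```

Three timers due at times 10, 12, and 14 need three wakeups when no latitude is given, but only one when each may be advanced or deferred by 3 time units.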


Efficient Resource Consumption 
Clearly, resource managers can con- 
tribute much to the overall efficiency 
of the system, but ultimately they are 
forced to work within the constraints 
and requests put forth by the system’s 
resource consumers. Where the con- 
straints are excessive and resources are 
overallocated or not efficiently used, 
the benefits of even the most sophisti- 
cated power-management features can 
be for naught while the efficiency of the 
entire system stack is compromised. 
Well-designed, efficient software is 
a thing of beauty showing good pro- 
portionality between utilization (and 
useful work done) and the amount of 
resources consumed. For utopian soft- 
ware, such proportionality would be 
perfect, demonstrating that when no 
work is done, zero resources are used; 
and as resource utilization scales high-
er, the amount of work done scales
similarly (see Figure 3). 

Real software is not utopian, though, 
and the only way to have software con- 
sume zero resources is not to run it at 
all. Even running very well-behaved 
software at the minimum will, in prac- 
tice, require some resource overhead. 

By contrast, inefficient software 
demonstrates poor proportionality be- 
tween resource utilization and amount 
of work done. Here are some common 
examples: 

» A process is waiting for something, 
such as the satisfying of a condition, 
and is using a timer to periodically 
wake up to check if the condition has 
been satisfied. No useful work is being 
done as it waits, but each time it wakes 
up to check, the CPU is forced to leave 
an idle power-managed state. What’s 
worse, the process has decided to wake 
up with high frequency to “minimize 
latency” (see Figure 4). 

» An application uses multiple
threads to improve concurrency and 
scale throughput. It blindly creates as 
many threads as there are CPUs on the 
system, even though the application 
is unable to scale beyond a handful of 
threads because of an internal bottle- 
neck. Having more threads means 
more CPUs must be awakened to run 
them, despite little to no marginal con- 
tribution to performance with each ad- 
ditional thread (see Figure 5). 

» A service slowly leaks memory, and
over time its heap grows to consume 
much of the system’s physical memory, 
despite little to none of it actually being 
needed. As a consequence, little oppor- 
tunity exists to power manage memory 
since most of it has been allocated. 


Observing Inefficiency in the Software Ecosystem

Comprehensive analysis of software ef- 
ficiency requires the ability to observe 
the proportionality of resource utili- 
zation versus useful work performed. 
Of course, the metric for “work done” 
is inherently workload specific. Some 
workloads (such as Web servers and 
databases) might be throughput based. 
For such workloads, one technique 
could be to plot throughput (for ex- 
ample, transactions per second) versus 
{cpu|memory|bandwidth|storage} re-
source consumption. Where a “knee”
in the curve exists (resource utilization
rises, yet throughput does not), there
is an opportunity either to use fewer
resources to do the same work or per-
haps to eliminate a bottleneck to facili-
tate doing more work using the same
resources.

Figure 3. Good efficiency.

Figure 4. “Idle” inefficiency.

Figure 5. Scaling inefficiency.

(Axis labels: Work Done, Resource Utilization. Region label: Inefficiency.)

For parallel computation workloads
that use concurrency to speed up pro-
cessing, one could plot elapsed com-
putation time versus the resources
consumed, and using a similar tech-
nique identify and avoid the point of
diminishing returns.

Rather than using workload-specific 
analysis, another fruitful technique is 
looking at systemwide resource utiliza- 
tion behavior at what should be zero uti- 
lization (system idle). By definition, the 
system isn’t doing anything useful, so 
any software that is actively consuming 
CPU cycles is instantly suspect. Power-
TOP is an open source utility developed
by Intel specifically to support this
methodology of analysis (see Figure 6). 
Running the tool on what should be an 
otherwise idle system, one would expect 
ideally that the system’s processors are 
power managed 100% of the time, but 
in practice, inefficient software (usually 
doing periodic time-based polling) will 
keep CPUs fractionally busy. PowerTOP 
shows the extent of the waste, while 
also showing which software is respon- 
sible. System users can then report the 
observed waste as bugs and/or elect to 
run more efficient software. 


Designing Efficient Software 
Efficiency as a design and optimization 
point for software might at first seem 
a bit foreign, so let’s compare it with 
some others that are arguably more es-
tablished: performance and scalability.

» Well-performing software maxi-
mizes the amount of useful work done
(or minimizes the time taken to do it),
given a fixed set of resources.

» Scalable software will demonstrate
that performance proportionally in-
creases as more resources are used.

Efficient software can be said to be
both well performing and scalable,
but with some additional constraints
around resource utilization.

» Given a fixed level of performance
(amount of useful work done or amount
of time taken to do it), software uses the
minimal set of resources required.

» As performance decreases, re-
source utilization proportionally de-
creases.

This implies that in addition to
looking at the quantity and propor-
tionality of performance given the re-
source utilization, to capture efficiency,
software designers also need to con-
sider the quantity and proportionality
of resource utilization given the perfor-
mance. If all this seems too abstract,
here are some more concrete factors 
to keep in mind:

» When designing software that will
be procuring its own resources, ensure
it understands what resources are re-
quired to get the job done and yields
them back when not needed to facili- 
tate idle resource power management. 
If the procured resources will be need- 
ed on an intermittent basis, have the 
software try to leverage features that 
provide hints to the resource manager 
about when resources are (and are not) 
being used at least to facilitate active 
power management. 

With respect to CPU utilization:

» When threads are waiting for some
condition, try to leverage an event-trig-
gered scheme to eliminate the need for
time-based polling. Don’t write “are we
there yet?” software.

» If this isn’t possible, try to poll in-
frequently.

» If polling can’t be eliminated, try at least
to ensure that all periodic/polling activ-
ity is batch processed. Leverage timer
subsystem features that provide lati-
tude for optimization, such as coarsen-
ing resolution or allowing for timer ad-
vance/deferral.

With respect to memory utilization:

» Watch for memory leaks.

» Free or unmap memory that is
no longer needed. Some operating
systems provide advisory interfaces
around memory utilization, such as
madvise(3c) under Solaris.

With respect to I/O utilization:

» If possible, buffer/batch I/O re-
quests.
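
Batching can be as simple as accumulating small writes and submitting them together. The following C sketch assumes a caller-supplied sink function standing in for the real I/O call; `batch_writer`, `batch_write`, `batch_flush`, `count_requests`, and the 64-byte `BATCH_CAPACITY` are all hypothetical names and values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_CAPACITY 64 /* hypothetical buffer size in bytes */

/* Hypothetical sink type: stands in for the real write or submit call. */
typedef void (*sink_fn)(const char *data, size_t len, void *ctx);

struct batch_writer {
    char    buf[BATCH_CAPACITY];
    size_t  used;
    sink_fn sink;
    void   *ctx;
};

/* Push all buffered bytes to the sink as a single request. */
void batch_flush(struct batch_writer *w)
{
    if (w->used > 0) {
        w->sink(w->buf, w->used, w->ctx);
        w->used = 0;
    }
}

/* Queue a small write; the sink is called only when the buffer would overflow. */
void batch_write(struct batch_writer *w, const char *data, size_t len)
{
    assert(len <= BATCH_CAPACITY);
    if (w->used + len > BATCH_CAPACITY)
        batch_flush(w);
    memcpy(w->buf + w->used, data, len);
    w->used += len;
}

/* Example sink that only counts how many requests reach the device;
 * ctx must point to a size_t counter. */
void count_requests(const char *data, size_t len, void *ctx)
{
    (void)data;
    (void)len;
    ++*(size_t *)ctx;
}
```

Ten 8-byte writes followed by a final flush reach the sink as two requests instead of ten, giving the device longer uninterrupted idle periods.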


Driving Toward an Efficient System Stack

Every so often, evolution and innova-
tion in hardware design bring about
new opportunities and challenges for 
software. Features to reduce power 
consumption of underutilized system 
resources have become pervasive in 
even the largest systems, and the soft- 
ware layers responsible for managing 
those resources must evolve in turn— 
implementing policies that drive per- 
formance for utilized resources while 
reducing power for those that are un- 
derutilized. 

Beyond the resource managers, re- 
source consumers clearly have a signif- 
icant opportunity either to contribute 
to or undermine the efficiency of the 
broader stack. Though getting pro- 
grammers to think differently about 
the way they design software is more 
than a technical problem, tools such 
as PowerTOP represent a great first 
step by providing programmers and 
administrators with observability into 
software inefficiency, a point of refer- 
ence for optimization, and awareness 
of the important role software plays in 
energy-efficient computing.


Managing Contention for Shared Resources on Multicore Processors

Contention for caches, memory controllers,
and interconnects can be eased by
contention-aware scheduling algorithms.

BY ALEXANDRA FEDOROVA, SERGEY BLAGODUROV,
AND SERGEY ZHURAVLEV

[...] LLCs (last-level caches; for example [...]), memory controllers, and interconnects [...]
[...]ters as memory domains because [...]


shared resources mostly have to do
with the memory hierarchy. Figure 1
provides an illustration of a system
with two memory domains and two
cores per domain.

Threads running on cores in the
same memory domain may compete
for the shared resources, and this
contention can significantly degrade
their performance relative to what they
could achieve running in a contention-
free environment. Consider an ex-
ample demonstrating how contention
for shared resources can affect appli-
cation performance. In this example,
four applications—Soplex, Sphinx,
Gamess, and Namd, from the Standard
Performance Evaluation Corporation
(SPEC) CPU 2006 benchmark suite—
run simultaneously on an Intel Quad- 
Core Xeon system similar to the one 
depicted in Figure 1. 

As a test, we ran this group of appli- 
cations several times, in three different 
schedules, each time with two different 
pairings sharing a memory domain. 
The three pairing permutations af- 
forded each application an opportu- 
nity to run with each of the other three 
applications within the same memory 
domain:

» Soplex and Sphinx ran in a mem-
ory domain, while Gamess and Namd
shared another memory domain.

» Sphinx was paired with Gamess,
while Soplex shared a domain with
Namd.

» Sphinx was paired with Namd,
while Soplex ran in the same domain
with Gamess.

Figure 1. A schematic view of a multicore system with two memory domains representing the architecture of Intel Quad-Core Xeon processors.

Figure 2 contrasts the best perfor-
mance of each application with its
worst performance. The performance
levels are indicated in terms of the
percentage of degradation from solo
execution time (when the application
ran alone on the system), meaning that
the lower the numbers, the better the
performance.

Figure 2. Percentage of performance degradation over a solo run achieved in two different scheduling assignments: the best and the worst. The lower the bar, the better the performance.

There is a dramatic difference be- 
tween the best and the worst schedules, 
as shown in the figure. The workload as 
a whole performed 20% better with the 
best schedule, while gains for individu- 
al applications Soplex and Sphinx were 
as great as 50%. This indicates a clear 
incentive for assigning applications to 
cores according to the best possible 
schedule. While a contention-oblivi- 
ous scheduler might accidentally hap- 
pen upon the best schedule, it could 
just as well run the worst schedule. A 
contention-aware scheduler, on the oth-
er hand, would be better positioned to
choose a schedule that performs well. 
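
With four threads and two memory domains there are only three possible pairings, so a scheduler that had an estimate of pairwise slowdown could simply evaluate all of them. The sketch below assumes such an estimate is available as a symmetric matrix; the `cost` values, `PAIRINGS`, and `best_pairing` are hypothetical and are not measurements or code from this study.

```c
#include <assert.h>

/* The three ways to split four threads into two pairs. */
static const int PAIRINGS[3][4] = {
    {0, 1, 2, 3}, /* (0,1) and (2,3) */
    {0, 2, 1, 3}, /* (0,2) and (1,3) */
    {0, 3, 1, 2}, /* (0,3) and (1,2) */
};

/* cost[i][j]: assumed combined slowdown (percent) when threads i and j
 * share a memory domain; the matrix is taken to be symmetric.
 * Returns the index into PAIRINGS of the schedule with the lowest total
 * slowdown and, if total_out is non-null, stores that total there. */
int best_pairing(double cost[4][4], double *total_out)
{
    int best = 0;
    double best_total = 0.0;
    for (int p = 0; p < 3; p++) {
        const int *s = PAIRINGS[p];
        double total = cost[s[0]][s[1]] + cost[s[2]][s[3]];
        if (p == 0 || total < best_total) {
            best = p;
            best_total = total;
        }
    }
    if (total_out)
        *total_out = best_total;
    return best;
}
```

Exhaustive search is only practical at this scale; the hard part, which the rest of the article addresses, is obtaining the pairwise estimates without running every combination.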

This article describes an investiga- 
tion of a thread scheduler that would 
mitigate resource contention on mul- 
ticore processors. Although we began 
this investigation using an analytical 
modeling approach that would be dif- 
ficult to implement online, we ulti- 
mately arrived at a scheduling method 
that can be easily implemented online 
with a modern operating system or 
even prototyped at the user level. To 
share a complete understanding of the 
problem, we describe both the offline 
and online modeling approaches. The 
article concludes with some actual per- 
formance data that shows the impact 
contention-aware scheduling tech- 
niques can have on the performance 
of applications running on currently 
available multicore systems. 

To make this study tractable we 
made the assumption that the threads 
do not share any data (that is, they be- 
long either to different applications 
or to the same application where each 
thread works on its own data set). If 
threads share data, they may actually
benefit from running in the same do-
main. In that case, they may access the
shared resources cooperatively; for ex-
ample, prefetch the data for each other
into the cache. While the effects of co-
operative sharing must be factored into
a good thread-placement algorithm
for multicores, this subject has been
explored elsewhere. The focus here is
managing resource contention.

Figure 3. Mcf (a) is an application with a rather poor temporal locality, hence the low reuse frequency and high miss frequency. Povray (b) has excellent temporal locality. Milc (c) rarely reuses its data, therefore showing a very low reuse frequency and a very high miss frequency.


Understanding Resource Contention

To build a contention-aware scheduler,
we must first understand how to model
contention for shared resources. Mod-
eling allows us to predict whether a
particular group of threads is likely
to compete for shared resources and
to what extent. Most of the academic
work in this area has focused on mod-
eling contention for LLCs, as this was
believed to have the greatest effect on
performance. This is where we started
our investigation as well.

Cache contention occurs when two
or more threads are assigned to run on
the cores of the same memory domain
(for example, Core 0 and Core 1 in Fig-
ure 1). In this case, the threads share
the LLC. A cache consists of cache lines
that are allocated to hold the memory
of threads as the threads issue cache
requests. When a thread requests a line
that is not in the cache—that is, when it
issues a cache miss—a new cache line
must be allocated. The issue here is
that when a cache line must be allocat-
ed but the cache is full (which is to say
whenever all the other cache lines are
being used to hold other data), some
data must be evicted to free up a line
for the new piece of data. The evicted
line might belong to a different thread
from the one that issued the cache
miss (modern CPUs do not assure any
fairness in that regard), so an aggres-
sive thread might end up evicting data
for some other thread and thus hurting
its performance.

Although several researchers have
proposed hardware mechanisms for
mitigating LLC contention, to the
best of our knowledge these have not
been implemented in any currently
available systems. We therefore looked
for a way to address contention in the
systems that people are running now
and ended up turning our attention to
scheduling as a result. Before building
a scheduler that avoids cache conten-
tion, however, we needed to find ways
to predict contention.

There are two schools of thought
regarding the modeling of cache con-
tention. The first suggests that consid-
ering the LLC miss rate of the threads
is a good way to predict whether these
threads are likely to compete for the
cache (the miss rate under contention
is an equally good heuristic as the solo
miss rate). A miss rate is the number
of times per instruction when a thread
fails to find an item in the LLC and so
must fetch it from memory. The rea-
soning is that if a thread issues lots of
cache misses, it must have a large cache
working set, since each miss results in
the allocation of a new cache line. This
way of thinking also maintains that any
thread that has a large cache working
set must suffer from contention (since
it values the space in the cache) while
inflicting contention on others.

This proposal is contradicted by fol-
lowers of the second school of thought,
who reason that if a thread hardly ever
reuses its cached data—as would be the
case with a video-streaming application
that touches the data only once—it will
not suffer from contention even if it
brings lots of data into the cache. That
is because such a thread needs very lit-
tle space to keep in cache the data that
it actively uses. This school of thought
advocates that to model cache conten-
tion one must consider the memory-
reuse pattern of the thread. Followers
of this approach therefore created sev-
eral models for shared cache conten-
tion based on memory-reuse patterns,
and they have demonstrated that this
approach predicts the extent of con-
tention quite accurately. On the other
hand, only limited experimental data
exists to support the plausibility of the
other approach to modeling cache con-
tention. Given the stronger evidence in
favor of the memory-reuse approach,
we built our prediction model based
on that method.

dicates the miss frequency (how often
that application misses in the cache).
The reuse-frequency histogram shows
the locality of reused data. Each bar
represents a range of reuse distances—
these can be thought of as the number
of time steps that have passed since the
reused data was last touched. An appli-
cation with a very good temporal locali-
ty will have many high bars to the left of
the histogram (Figure 3b), as it would
reuse its data almost immediately. This
particular application also has negli-
gible miss frequency. An application
with a poor temporal locality will have
a flatter histogram and a rather high
miss frequency (Figure 3a). Finally, an
application that hardly ever reuses its
data will result in a histogram indicat-
ing negligible reuse frequency and a
very high miss frequency (Figure 3c).
Memory-reuse profiles have been
used in the past to effectively model
the contention between threads that

A memory-reuse pattern is captured 
by a memory-reuse profile, also known 
as the stack-distance’ or reuse-distance 
profile.! Figure 3 shows examples of 
memory-reuse profiles belonging to ap- 
plications in the SPEC CPU2006 suite. 
The red bars on the left show the reuse 
frequency (how often the data is re- 


share a cache.* These models, the de- 
tails of which we omit from this article, 
are based on the shape of memory-re- 
use profiles. One such model, the SDC 
(stack distance competition) examines 
the reuse frequency of threads sharing 
the cache to determine which of the 
threads is likely to “win” more cache 
space; the winning thread is usually 
the one with the highest overall reuse 
frequency. Still, the SDC model, along 
with all the other models based on 
memory-reuse profiles, was deemed 
too complex for our purposes. After 
all, our goal was to use a model in an 
operating-system scheduler, mean- 
ing it needed to be both efficient and 
lightweight. Furthermore, we were 
interested in finding methods for ap- 
proximating the sort of information 
memory-reuse profiles typically afford 
using just the data that’s available at 
runtime, since memory-reuse profiles 


_ themselves are very difficult to obtain 


at runtime. Methods for obtaining 
these profiles online require uncon- 
ventional hardware’ or rely on hard- 


| ware performance counters available 


only on select systems.® 

Our goal, therefore, was to capture 
the essence of memory-reuse profiles 
in a simple metric and then find a way 
to approximate this metric using data 
that a thread scheduler can easily ob- 
tain online. To this end, we discovered 


used), and the blue bar on the right in- | that memory-reuse profiles are highly 
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successful at modeling contention 
largely because they manage to capture 


| from the thread in question since the | based on memory-reuse profiles, we 
| high access rate shows that it allocates | looked foraway to approximate it using 


two important qualities related to con- 
tention: sensitivity and intensity. Sensi- 
tivity measures how much a thread suf- 
fers whenever it shares the cache with 
other threads. Intensity, on the other 
hand, measures how much a thread 
hurts other threads whenever it shares 
acache with them. Measuring sensitivi- 
ty and intensity appealed to us because 
together they capture the key informa- 
tion contained within memory-reuse 
profiles; we also had some ideas about 
how they could be approximated us- 
ing online performance data. Before 
learning how to approximate sensitiv- 
ity and intensity, however, we needed 
to confirm that these were indeed good 
bases for modeling cache contention 
among threads. To accomplish that, 
we formally derived the sensitivity and 
intensity metrics based on data in the 


memory-reuse profiles. After confirm- | 


ing that the metrics derived in this way 
did indeed accurately model conten- 


tion, we could then attempt to approxi- | 


mate them using just online data. 
Accordingly, we derived sensitivity $ 
and intensity Z for an application using 
data from its memory-reuse profile. To 
compute S, we applied an aggregation 
function to the reuse-frequency histo- 
gram. Intuitively, the higher the reuse 
frequency, the greater an application 
is likely to suffer from the loss of cache 
space due to contention with another 
application—signifying a higher sen- 
sitivity. To compute Z, we simply used 


ferred from the memory-reuse profile. 
Intuitively, the higher the access rate, 
the higher the degree of competition 


new cache lines while retaining old 
ones. Details for the derivation of S and 
Z are described in another article.'° 

Using the metrics S and Z, we then 
created another metric called Pain, 
where the Pain of thread A due to shar- 
ing a cache with thread B is the product 
of the sensitivity of A and the intensity 
of B, and vice versa. A combined Pain 
for the thread pair is the sum of the 
Pain of A due to B and the Pain of B due 
to A, as shown here: 


Pain(A|B) = S,4 *Z, 
Pain(B|A) = Sp * Zs 
Pain(A,B) - Pain(A|B) + Pain(B|A) 


Intuitively, Pain(A|B) approximates 
the performance degradation of A 
when A runs with B relative to running 
solo. It will not capture the absolute 
degradation entirely accurately, but it 
is good for approximating relative deg- 


radations. For example, given two po- | 


tential neighbors for A, the Pain metric 
can predict, which will cause a higher 
performance degradation for A. This is 
precisely the information a contention- 
aware scheduler would require. 

The Pain metric shown here as- 
sumes that only two threads share a 
cache. There is evidence, however, 
showing this metric applies equally 


_ well when more than two threads 


share the cache. In that case, in order 
to compute the Pain for a particular 


| thread as a consequence of running 
the cache-access rate, which can be in- | 


with all of its neighbors concurrently, 
the Pain owing to each neighbor must 
be averaged. 

After developing the Pain metric 


Figure 4. Computing the pain for all possible schedules. The schedule with the lowest Pain 


is chosen as the estimated best schedule. 


Schedule [(A,B), (C,D)}: 
Schedule [(A,C), (B,D)}: 
Schedule [(A,D), (B,C)}: 


Pain = Average( Pain(A,B), Pain(C, D 
Pain = Average( Pain(A,C), Pain(B, D)) 
Pain = Average( Pain(A,D), Pain(B, C 


)) 
)) 


Figure 5. The metric for comparing the actual best and the estimated best schedule. 


Estimated Best Schedule [(A,B), (C,D)}: 


DegradationEst = Average( Degrad(AIB), Degrad(BIA), Degrad(C|D), Degrad(DIC)) 


Actual Best Schedule [(A,C), (B,D)}: 


DegradationAct = Average( Degrad(A\C), Degrad(ClA), Degrad(BID), Degrad(DIB)) 


DegradationEst 


Degradation over actual best = ( DegradationAct — 
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just the data available online via stan- 
dard hardware performance counters. 


| This led us to explore two performance 


metrics to approximate sensitivity and 
intensity: the cache-miss rate and the 


| cache-access rate. Intuitively these met- 
| rics correlate with the reuse frequency 


and the intensity of the application. 
Our findings regarding which metric 
offers the best approximation are sur- 
prising, so to maintain some suspense, 
we postpone their revelation until the 
section entitled “Evaluation of Model- 
ing Techniques.” 
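
The Pain definitions given earlier can be illustrated with a minimal Python sketch. The class name `ThreadProfile`, the function names, and the numeric values of S and Z are hypothetical and chosen only for illustration; the article derives S and Z from memory-reuse profiles, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadProfile:
    """Hypothetical container for one thread's contention metrics."""
    name: str
    sensitivity: float  # S: how much the thread suffers when sharing a cache
    intensity: float    # Z: how much the thread hurts its neighbors


def pain_due_to(victim: ThreadProfile, neighbor: ThreadProfile) -> float:
    """Pain(victim|neighbor) = S_victim * Z_neighbor."""
    return victim.sensitivity * neighbor.intensity


def pair_pain(a: ThreadProfile, b: ThreadProfile) -> float:
    """Pain(A,B) = Pain(A|B) + Pain(B|A)."""
    return pain_due_to(a, b) + pain_due_to(b, a)


def pain_with_neighbors(victim: ThreadProfile, neighbors: list) -> float:
    """With more than two threads per cache, average the Pain owing to each neighbor."""
    return sum(pain_due_to(victim, n) for n in neighbors) / len(neighbors)


# Illustrative values only (not measured data).
a = ThreadProfile("A", sensitivity=0.8, intensity=0.2)
b = ThreadProfile("B", sensitivity=0.1, intensity=0.9)
c = ThreadProfile("C", sensitivity=0.5, intensity=0.3)

# Given two potential neighbors for A, compare which one hurts A more.
worse_neighbor = max((b, c), key=lambda n: pain_due_to(a, n))
print(worse_neighbor.name)
```

Only the relative ordering of the Pain values matters to a scheduler, which is why the sketch compares candidate neighbors instead of interpreting the absolute numbers.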


Using Contention Models in a Scheduler

In evaluating the new models for cache 
contention, our goal was to determine 
how effective the models would be for 
constructing contention-free thread 
schedules. We wanted the model to 
help us find the best schedule and 
avoid the worst one (recall Figure 2). 
Therefore, we evaluated the models on 
the merit of the schedules they managed to construct. With that in mind,
we describe here how the scheduler
uses the Pain metric to find the best 
schedule. 

To simplify the explanation for this evaluation, we have a system with two pairs of cores sharing the two caches (as illustrated in Figure 1), but as mentioned earlier, the model also works well with more cores per cache. In this case, however, we want to find the best schedule for four threads. The scheduler would construct all the possible permutations of threads on this system, with each of the permutations being unique in terms of how the threads are paired on each memory domain. If we have four threads (A, B, C, and D), there will be three unique schedules: (1) {(A,B), (C,D)}; (2) {(A,C), (B,D)}; and (3) {(A,D), (B,C)}. Notation (A,B) means that threads A and B are co-scheduled in the same memory domain. For each schedule, the scheduler estimates the Pain for each pair: in schedule {(A,B), (C,D)} the scheduler would estimate Pain(A,B) and Pain(C,D) using the equations presented previously. Then it averages the Pain values of the pairs to estimate the Pain for the schedule as a whole. The schedule with the lowest Pain is deemed to be the estimated best schedule. Figure 4 is a summary of this procedure.

The estimated best schedule can 
be obtained either by using the Pain 
metric constructed via actual memory- 
reuse profiles or by approximating the 
Pain metric using online data. 
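
The procedure summarized in Figure 4 can be sketched in Python as follows. This is a minimal illustration assuming two cores per memory domain and an even number of threads; the helper names and the S and Z values are hypothetical.

```python
def pairings(threads):
    """Yield every unique way to split `threads` into unordered pairs.

    Assumes an even number of threads and two cores per memory domain.
    """
    if not threads:
        yield []
        return
    first, rest = threads[0], threads[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def schedule_pain(schedule, pair_pain):
    """Average the Pain of all co-scheduled pairs in one schedule."""
    return sum(pair_pain(a, b) for a, b in schedule) / len(schedule)


def estimated_best_schedule(threads, pair_pain):
    """Return the schedule with the lowest average Pain."""
    return min(pairings(threads), key=lambda s: schedule_pain(s, pair_pain))


# Illustrative sensitivity (S) and intensity (Z) values, not measured data.
S = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.2}
Z = {"A": 0.9, "B": 0.7, "C": 0.2, "D": 0.1}


def pair_pain(a, b):
    """Pain(A,B) = S_A * Z_B + S_B * Z_A."""
    return S[a] * Z[b] + S[b] * Z[a]


best = estimated_best_schedule(["A", "B", "C", "D"], pair_pain)
print(best, schedule_pain(best, pair_pain))
```

With these made-up values the two sensitive, intensive threads A and B end up in different memory domains. Exhaustive enumeration is fine for a handful of threads, but the number of schedules grows quickly with the number of cores.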

Once the best schedule has been es- 
timated, we must compare the perfor- 
mance of the workload in the estimat- 
ed best schedule with the performance 
achieved in the actual best schedule. 
The most direct way of doing this is to 
run the estimated best schedule on real 
hardware and compare its performance 
with that of the actual best schedule, 


which can be obtained by running all schedules on real hardware and then choosing the best one. Although this is the most direct approach (which we used for some experiments in the study), it limited the number of workloads we could test because running all possible schedules for a large number of workloads is time consuming.

To evaluate a large number of workloads in a short amount of time, we invented a semi-analytical evaluation methodology that relies partially on data obtained from tests on a real system and otherwise applies analytical techniques. Using this approach, we selected 10 benchmark applications from the SPEC CPU2006 suite to use in the evaluation. They were chosen using the minimum spanning-tree-clustering method to ensure the applications represented a variety of memory-access patterns.

We then ran all possible pairings of these applications on the experimental platform, a Quad-Core Intel Xeon system, where each Quad-Core processor looked like the system depicted in Figure 1. In addition to running each possible pair of benchmark applications on the same memory domain, we ran each benchmark alone on the system. This gave us the measure for the actual degradation in performance for each benchmark as a consequence of sharing a cache with another benchmark, as opposed to running solo. Recall that degradation relative to performance in the solo mode is precisely the quantity approximated by the Pain metric. So by comparing the scheduling assignment constructed based on the actual degradation to that constructed based on the Pain metric, we can evaluate how good the Pain metric is in finding good scheduling assignments.

Having the actual degradations enabled us to construct the actual best schedule using the method shown in Figure 5. The only difference was that, instead of using the model to compute Pain(A,B), we used the actual performance degradation that we had measured on a real system (with Pain being equal to the sum of degradation of A running with B relative to running solo and the degradation of B running with A relative to running solo).

Once we knew the actual best schedule, we needed a way to compare it with the estimated best schedule. The performance metric was the average degradation relative to solo execution for all benchmarks. For example, suppose the estimated best schedule was {(A,B), (C,D)}, while the actual best schedule was {(A,C), (B,D)}. We computed the average degradation for each schedule to find the difference between the degradation in the estimated best versus the degradation in the actual one, as indicated in Figure 5. The notation Degrad(A|B) refers to the measured performance degradation of A when running alongside B, relative to A running solo.

This illustrates how to construct estimated best and actual best schedules for any four-application workload on a system with two memory domains so long as actual pair-wise degradations for any pair of applications have been obtained on the experimental system. Using the same methodology, we can evaluate this same model on systems with a larger number of memory domains. In that case, the number of possible schedules grows, but everything else in the methodology remains the same. In using this methodology, we assumed that the degradation for an application pair (A,B) would be the same whether it were obtained on a system where only A and B were running or on a system with other threads, such as (C,D) running alongside (A,B) on another domain. This is not an entirely accurate assumption since, with additional applications running, there will be a higher contention for the front-side bus. Although there will be some error in estimating schedule-average degradations under this method, the error is not great enough to affect the evaluation results significantly. Certainly, the error is not large enough to lead to a choice of the “wrong” best schedule.

Figure 6. The percentage by which performance of schedules estimated to be best according to various modeling techniques varies from the actual best schedules. Low bars are good.

Figure 7. A breakdown of factors causing performance degradation due to contention for shared hardware on multicore systems based on tests using select applications in the SPEC CPU2006 suite.

Figure 8. A breakdown of factors causing performance degradation due to contention for shared hardware on multicore systems based on tests using select applications in the SPEC CPU2006 suite. These experiments were performed on an Intel Xeon (Clovertown) processor. We also obtained data showing that cache contention is not dominant on AMD Opteron systems.
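
The comparison between an estimated best and an actual best schedule, as described for Figure 5, might be sketched as follows. The degradation values are hypothetical, and the final percentage formula (ratio of the two schedule averages, minus one, times 100) is an assumption made for this sketch, not a formula confirmed by the text.

```python
def schedule_degradation(schedule, degrad):
    """Average Degrad(x|y) over every thread in every co-scheduled pair.

    `degrad[(x, y)]` is the measured slowdown of x when running
    alongside y, relative to x running solo.
    """
    values = []
    for a, b in schedule:
        values.append(degrad[(a, b)])  # Degrad(A|B)
        values.append(degrad[(b, a)])  # Degrad(B|A)
    return sum(values) / len(values)


def percent_over_actual_best(estimated, actual, degrad):
    """Assumed metric: how much worse the estimated schedule is, in percent."""
    est = schedule_degradation(estimated, degrad)
    act = schedule_degradation(actual, degrad)
    return (est / act - 1.0) * 100.0


# Hypothetical pair-wise degradations (percent slowdown versus solo).
degrad = {
    ("A", "B"): 10.0, ("B", "A"): 20.0,
    ("C", "D"): 5.0,  ("D", "C"): 5.0,
    ("A", "C"): 8.0,  ("C", "A"): 4.0,
    ("B", "D"): 12.0, ("D", "B"): 8.0,
}

estimated_best = [("A", "B"), ("C", "D")]
actual_best = [("A", "C"), ("B", "D")]
print(percent_over_actual_best(estimated_best, actual_best, degrad))
```

Because only pair-wise measurements are needed, the same table of degradations can be reused to score any schedule built from those pairs.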


Evaluation of Modeling Techniques 
Here, we present the results obtained 
using our semi-analytical methodolo- 
gy, followed by the performance results 
obtained via experiments only. Figure 
6 compares the degradation over the 
actual best schedules (the method for 
which was indicated earlier) with es- 
timated best schedules constructed 
using various methods. The blue bar 
indicating Pain is the model that uses 
memory-reuse profiles to estimate Pain 
and find the best schedule (the method 
for which was set forth in Figure 4). In 
the red bar indicating Approx-Pain, the 
Pain for a given application running 
with another is estimated with the aid 
of data obtained online (we explain 
which data this is at the end of this sec- 
tion); once Pain has been estimated, 
we can once again use the method 
shown in Figure 4. In SDC, a previously 
proposed model based on memory- 
reuse profiles* can be used to estimate 
the performance degradation of an application when it shares a cache with a
co-runner. This estimated degradation
can then be used in place of Pain(A|B); 
apart from that, the method shown 
in Figure 4 applies. Although SDC is 
rather complex for use in a scheduler, 
we compared it with our new models 
to evaluate how much performance 
was being sacrificed by using a sim- 
pler model. Finally, in Figure 6 the bar 
labeled Random shows the results for 
selecting a random thread placement.

Figure 6 shows how much worse the 
schedule chosen with each method 
ended up performing relative to the 
actual best schedule. This value was 
computed using the method shown in 
Figure 4. Ideally, this difference from 
the actual best ought to be small, so 
in considering Figure 6, remember 
that low bars are good. The results for 
four different systems—with four, six, 
eight, and 10 cores—are indicated. 
In all cases there were two cores per 
memory domain. (Actual results from 
a system with a larger number of cores 
per memory domain are shown later.) 
Each bar represents the average for all the benchmark pairings that could be constructed out of our 10 representative benchmarks on a system with a given number of cores, such that there is exactly one benchmark per core. On four- and six-core systems, there were 210 such combinations, whereas an eight-core system had 45 combinations, and a 10-core system had only one combination. For each combination, we predicted the best schedule. The average performance degradation from the actual best for each of these estimated best schedules is reported in each bar in the figure.
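
The workload counts quoted above follow from choosing one benchmark per core out of 10 benchmarks, which a short Python check can confirm. The second function, which counts the unique schedules per workload, is an added illustration that assumes two cores per memory domain and interchangeable domains; the function names are hypothetical.

```python
from math import comb

BENCHMARKS = 10  # size of the representative benchmark set


def workload_count(cores: int, benchmarks: int = BENCHMARKS) -> int:
    """Number of distinct workloads with exactly one benchmark per core."""
    return comb(benchmarks, cores)


def schedules_per_workload(cores: int) -> int:
    """Number of unique ways to pair `cores` threads, two per memory domain.

    This is the product (cores - 1) * (cores - 3) * ... * 1.
    """
    result = 1
    for k in range(cores - 1, 0, -2):
        result *= k
    return result


for cores in (4, 6, 8, 10):
    print(cores, workload_count(cores), schedules_per_workload(cores))
```

The schedule count grows much faster than the workload count, which is one reason exhaustive runs on real hardware become impractical for larger systems.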

The first thing we learned from the metric in Figure 5 was that the Pain model is effective for helping the scheduler find the best thread assignment. It produces results that are within 1% of the actual best schedule. (The effect on the actual execution time is explored later.) We also found that choosing a random schedule produces significantly worse performance, especially as the number of cores grows. This is significant in that a growing number of cores is the expected trend for future multicore systems.

Figure 6 also indicates that the Pain approximated by way of an online metric works very well, coming within just 3% of the actual best schedule. At the same time, the SDC, a well-proven model from an earlier study, turns out to be less accurate. These results (both the effectiveness of the approximated Pain metric and the disappointing performance of the older SDC model) were quite unexpected. Who could have imagined that the best way to approximate the Pain metric would be to use the LLC miss rate? In other words, the LLC miss rate of a thread is the best predictor of both how much the thread will suffer from contention (its sensitivity) and how much it will hurt others (its intensity). As explained at the beginning of this article, while there was limited evidence indicating that the miss rate predicts contention, it ran counter to the memory-reuse-based approach, which was supported by a much larger body of evidence.

Our investigation of this paradox led us to examine the causes of contention on multicore systems. We performed several experiments that aimed to isolate and quantify the degree of contention for various types of shared resources: cache, memory controller, bus, prefetching hardware. The precise setup of these experiments is described in another study.10 We arrived at the following findings.

First, it turns out that contention for the shared cache (the phenomenon by which competing threads end up evicting each other’s cache lines from the cache) is not the main cause of performance degradation experienced by competing applications on multicore systems. Contention for other shared resources, such as the front-side bus, prefetching resources, and the memory controller, is the dominant cause of performance degradation (see Figure 7). That is why the older memory-reuse model, designed to model cache contention only, was not effective in our experimental environment. The authors of that model evaluated it on a simulator that did not model contention for resources other than shared cache, and it turns out that, when applied to a real system where other types of contention were present, the model did not prove effective.

On the other hand, cache-miss rate turned out to be an excellent predictor for contention for the memory controller, prefetching hardware, and front-side bus. Each application in our model was co-scheduled with the Milc application to generate contention. Given limitations of existing hardware counters, it was difficult to separate the effects of contention for prefetching hardware itself and the effects of additional contention for memory controller and front-side bus caused by prefetching. Therefore, the impact of prefetching shows the combined effect of these two factors.

An application issuing many cache misses will occupy the memory controller and the front-side bus, so it will not only hurt other applications that use that hardware, but also end up suffering itself if this hardware is usurped by others. An application aggressively using prefetching hardware will also typically have a high LLC miss rate, because prefetch requests for data that is not in the cache are counted as cache misses. Therefore, a high miss rate is also an indicator of the heavy use of prefetching hardware.

In summary, our investigation of contention-aware scheduling algorithms has taught us that high-miss-rate applications must be kept apart. That is, they should not be co-scheduled in the same memory domain. Although some researchers have already suggested this approach, it is not well understood why using the miss rate as a proxy for contention ought to be effective, particularly in that it contradicts the theory behind the popular memory-reuse model. Our findings should help put an end to this controversy.

Based on this new knowledge, we have built a prototype of a contention-aware scheduler that measures the miss rates of online threads and decides how to place threads on cores based on that information. Here, we present some experimental data showing the potential impact of this contention-aware scheduler.


Implications
Based on our understanding of contention on multicore processors, we have built a prototype of a contention-aware scheduler for multicore systems called Distributed Intensity Online (DIO). The DIO scheduler distributes intensive applications across memory domains (and by intensive we mean those with high LLC miss rates) after measuring online the applications’ miss rates. Another prototype scheduler, called Power Distributed Intensity (Power DI), is intended for scheduling applications in the workload across multiple machines in a data center. One of its goals is to save power by determining how to employ as few systems as possible without hurting performance. The following are performance results of these two schedulers.
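
The idea of distributing intensive applications across memory domains can be illustrated with a small Python sketch. The exact placement algorithm used by DIO is not given here, so the greedy policy below (place threads in order of decreasing miss rate, each on the least loaded domain that still has a free core) is purely an assumption, as are the function name and the miss-rate values.

```python
def distribute_intensity(miss_rates, num_domains, cores_per_domain):
    """Greedy sketch: spread high-miss-rate threads across memory domains.

    `miss_rates` maps a thread name to its measured LLC miss rate.
    Returns one list of thread names per memory domain.
    """
    if len(miss_rates) > num_domains * cores_per_domain:
        raise ValueError("more threads than cores")

    domains = [[] for _ in range(num_domains)]
    load = [0.0] * num_domains  # summed miss rate per domain

    # Place the most intensive threads first so they land on different domains.
    for thread in sorted(miss_rates, key=miss_rates.get, reverse=True):
        candidates = [d for d in range(num_domains)
                      if len(domains[d]) < cores_per_domain]
        target = min(candidates, key=lambda d: load[d])
        domains[target].append(thread)
        load[target] += miss_rates[thread]
    return domains


# Hypothetical miss rates for four threads (illustrative values only).
rates = {"mcf": 30.0, "milc": 25.0, "gcc": 2.0, "namd": 0.5}
placement = distribute_intensity(rates, num_domains=2, cores_per_domain=2)
print(placement)
```

In this example the two threads with the highest miss rates are kept apart, and each shares its domain with a low-miss-rate thread.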

Distributed Intensity Online. Different workloads offer different opportunities to achieve performance improvements through the use of a contention-aware scheduling policy. For example, a workload consisting of non-memory-intensive applications (those with low cache miss rates) will not experience any performance improvement since there is no contention to alleviate in the first place. Therefore, for our experiments we constructed eight-application workloads containing from two to six memory-intensive applications. We picked eight workloads in total, all consisting of SPEC CPU2006 applications, and then executed them under the DIO and the default Linux scheduler on an AMD Opteron system featuring eight cores, four per memory domain. The results are shown in Figure 8. The performance improvement relative to default has been computed as the average improvement for all applications in the workload (since not all applications are memory intensive, some do not improve). We can see that DIO renders workload-average performance improvements of up to 11%. Another potential use of DIO is as a way to ensure QoS (quality of service) for critical applications since DIO essentially provides a means to make sure the worst scheduling assignment is never selected, while the default scheduler may occasionally suffer as a consequence of a bad thread placement. Figure 9 shows for each of the applications as part of the eight test workloads its worst-case performance under DIO relative to its worst-case performance under the default Linux scheduler. The numbers are shown in terms of the percentage of improvement of the worst-case behavior achieved under DIO relative to that encountered with the default Linux scheduler, so higher bars in this case are better. We can see that some applications are as much as 60% to 80% better off with their worst-case DIO execution times, and in no case did DIO do significantly worse than the default scheduler.

Figure 9. Performance of eight workloads under DIO relative to the default Linux scheduler.

Figure 10. Worst-case performance for each of the applications included as part of the eight test workloads.

Figure 11. Percentage reduction in EDP.

Power Distributed Intensity. One of the most effective ways to reduce CPU power consumption is to turn off unused cores or entire memory domains in an active system. Similarly, if the workload is running on multiple machines (for example, in a data center), power savings can be accomplished by clustering the workload on as few servers as possible while powering down the rest. This seemingly simple solution is a double-edged sword, however, because clustering the applications on just a few systems may cause them to compete for shared system resources and thus suffer performance loss. As a result, more time will be needed to complete the workload, meaning that more energy will be consumed. In an attempt to save power it is also necessary to consider the impact that clustering can have on performance. A metric that takes into account both the energy consumption and the performance of the workload is the energy-delay product (EDP).* Based on our findings about contention-aware scheduling, we designed Power DI, a scheduling algorithm meant to save power without hurting performance.

Power DI works as follows: Assuming a centralized scheduler has knowledge of the entire computing infrastructure and distributes incoming applications across all systems, Power DI clusters all incoming applications on as few machines as possible, except for those applications deemed to be memory intensive. Similarly, within a single machine, Power DI clusters applications on as few memory domains as possible, with the exception of memory-intensive applications. These applications are not co-scheduled on the same memory domain with another application unless the other application has a very low cache miss rate (and thus a low memory intensity). To determine if an application is memory-intensive, Power DI uses an experimentally derived threshold of 1,000 misses per million instructions; an application whose LLC miss rate exceeds that amount is considered memory intensive.
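
The classification rule just described can be sketched in Python. The 1,000 misses-per-million-instructions threshold comes from the text; the cutoff for a "very low" miss rate is not stated, so `VERY_LOW_MISS_RATE` below is a hypothetical placeholder, as are the function names.

```python
MEMORY_INTENSIVE_THRESHOLD = 1000.0  # LLC misses per million instructions (from the text)
VERY_LOW_MISS_RATE = 100.0           # hypothetical cutoff, not given in the text


def misses_per_million(llc_misses: int, instructions: int) -> float:
    """Convert raw counter readings into misses per million instructions."""
    return llc_misses * 1_000_000 / instructions


def is_memory_intensive(miss_rate: float) -> bool:
    """An application whose miss rate exceeds the threshold is memory intensive."""
    return miss_rate > MEMORY_INTENSIVE_THRESHOLD


def may_share_domain(rate_a: float, rate_b: float) -> bool:
    """A memory-intensive application may only share a memory domain
    with an application that has a very low miss rate."""
    if is_memory_intensive(rate_a):
        return rate_b < VERY_LOW_MISS_RATE
    if is_memory_intensive(rate_b):
        return rate_a < VERY_LOW_MISS_RATE
    return True


rate = misses_per_million(llc_misses=5_000, instructions=2_000_000)
print(rate, is_memory_intensive(rate), may_share_domain(rate, 50.0))
```

Applications that are not memory intensive can always be clustered together, which is what lets the policy power down unused domains when the workload allows it.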

Although we did not have a data-center setup available to us to evaluate this algorithm, we simulated a multi-server environment in the following way. The in-house AKULA scheduling simulator created a schedule for a given workload on a specified data-center setup, which in this case consisted of 16 eight-core systems, assumed by the simulator to be Intel Xeon dual quad-core servers. Once the simulated scheduler decided how to assign applications across machines and memory domains within each machine, we computed the performance of the entire workload from the performance of the applications assigned by the scheduler to each eight-core machine. The performance on a single system could be easily measured via experimentation. This simulation method was appropriate for our environment, since there was no network communication among the running applications, meaning that inferring the overall performance from the performance of individual systems was reasonable.

To estimate the power consumption, we used a rather simplistic model (measurements with the actual power meter are still under way) but captured the right relationships between power consumed in various load conditions. We assumed that a memory domain where all the cores are running applications consumes one unit of power. A memory domain where one out of two cores is busy consumes 0.75 units of power. A memory domain where all cores are idle is assumed to be in a very low power state and thus consumes 0 units of power. We did not model the latency of power-state transitions.

We constructed a workload of 64 SPEC CPU2006 applications randomly drawn from the benchmark suite. We varied the fraction of memory-intensive applications in the workload from zero to 100%. The effectiveness of scheduling strategies differed according to the number of memory-intensive applications. For example, if there were no memory-intensive applications, it was perfectly fine to cluster all the applications to the greatest extent possible. Conversely, if all the applications were memory intensive, then the best policy was to spread them across memory domains so that no two applications would end up running on the same memory domain. An intelligent scheduling policy must be able to decide to what extent clustering must be performed given the workload at hand.

Figure 10 shows the EDP for three different scheduling methods: Power DI, a naive Spread method (which always spreads applications across machines to the largest extent possible), and the Cluster method (which in an attempt to save power always clusters applications on as few machines and as few memory domains as possible). The numbers are shown as a percentage reduction in the EDP (higher is better) of Power DI and Spread over Cluster.

We can see that when the fraction of memory-intensive applications in the workload is low, the naive Spread method does much worse than the Cluster method, but it beats Cluster as that fraction increases. Power DI, on the other hand, is able to adjust to the properties of the workload and minimize EDP in all cases, beating both Spread and Cluster (or at least matching them) for every single workload.


Conclusion 

Contention for shared resources
significantly impedes the efficient
operation of multicore systems. Our
research has provided new methods for
mitigating contention via scheduling
algorithms. Although it was previous- 
ly thought that the most significant 
reason for contention-induced per- 
formance degradation had to do with 
shared cache contention, we found 
that other sources of contention— 
such as shared prefetching hardware 
and memory interconnects—are just 
as important. Our heuristic—the LLC 
miss rate—proves to be an excellent 
predictor for all types of contention. 
Scheduling algorithms that use this 
heuristic to avoid contention have the 
potential to reduce the overall comple- 
tion time for workloads, avoid poor 
performance for high-priority applica- 
tions, and save power without sacrific- 
ing performance. 


FEBRUARY 2010 


VOL. 53 


5. Knauerhase, R., Brett, P., Hohlt, B., Li, T. and Hahn,
S. Using OS observations to improve performance in
multicore systems. IEEE Micro (2008), 54-66.

6. SPEC: Standard Performance Evaluation Corporation; 
http://www.spec.org. 

7. Suh, G., Devadas, S. and Rudolph, L. A new memory 
monitoring scheme for memory-aware scheduling and 
partitioning. In Proceedings of the 8th International 
Symposium on High-performance Computer 
Architecture (2002), 117. 

8. Tam, D., Azimi, R., Soares, L. and Stumm, M. 
RapidMRC: approximating L2 miss rate curves on 
commodity systems for online optimizations. In 
Proceedings of the 14th International Conference on 
Architectural Support for Programming Languages 
and Operating Systems (2009), 121-132. 

9. Tam, D., Azimi, R. and Stumm, M. Thread clustering:
sharing-aware scheduling on SMP-CMP-SMT
multiprocessors. In Proceedings of the 2nd ACM
SIGOPS/EuroSys European Conference on Computer
Systems (2007), 47-58.

10. Zhuravlev, S., Blagodurov, S. and Fedorova, A.
Addressing shared resource contention in multicore
processors via scheduling. In Proceedings of the 15th
International Conference on Architectural Support
for Programming Languages and Operating Systems
(2010).


Alexandra Fedorova is an assistant professor of 
computer science at Simon Fraser University in Vancouver, 
Canada, where she co-founded the SYNAR (Systems, 
Networking and Architecture) research lab. Her research 
interests span operating systems and virtualization 
platforms for multicore processors, with a specific focus 
on scheduling. Recently she started a project on tools and 
techniques for parallelization of video games, which has 
led to the design of a new language for this domain. 


Sergey Blagodurov is a Ph.D. student in computer 
science at Simon Fraser University, Vancouver, Canada. 
His research focuses on operating-system scheduling 
on multicore processors and exploring new techniques 
to deliver better performance on non-uniform memory 
access (NUMA) multicore systems. 


Sergey Zhuravlev is a Ph.D. student in computer science 
at Simon Fraser University, Vancouver, Canada. His 
recent research focuses on scheduling on multiprocessor 
systems to avoid shared resource contention as well as 
simulating computing systems. 


© 2010 ACM 0001-0782/10/0200 $10.00 


NO. 2 | COMMUNICATIONS OF THE ACM 57 


practice


DOI:10.1145/1646353.1646372 


A translator framework enables the use
of model checking in complex avionics
systems and other industrial settings.


BY STEVEN P. MILLER, MICHAEL W. WHALEN, 
AND DARREN D. COFER 


Software Model Checking Takes Off


ALTHOUGH FORMAL METHODS have been used in the 
development of safety- and security-critical systems 
for years, they have not achieved widespread industrial 
use in software or systems engineering. However, 
two important trends are making the industrial use 
of formal methods practical. The first is the growing 
acceptance of model-based development for the 
design of embedded systems. Tools such as MATLAB 
Simulink and Esterel Technologies SCADE Suite are
achieving widespread use in the design of avionics and
automotive systems. The graphical models produced
by these tools provide a formal, or nearly formal,
specification that is often amenable to formal analysis.
The second is the growing power of formal verification 
tools, particularly model checkers. For many classes 
of models they provide a “push-button”
means of determining if a model
meets its requirements. Since these
tools examine all possible combinations
of inputs and state, they are
much more likely to find design errors
than testing.


Here, we describe a translator
framework developed by Rockwell Collins
and the University of Minnesota
that allows us to automatically translate
from some of the most popular
commercial modeling languages to a
variety of model checkers and theorem
provers. We describe three case studies
in which these tools were used on
industrial systems that demonstrate that
formal verification can be used effectively
on real systems when properly
supported by automated tools.


PHOTOGRAPH: “FLUGHAFEN” BY HO-YEOL RYU


Model-Based Development
Model-based development (MBD)
refers to the use of domain-specific,
graphical modeling languages that
can be executed and analyzed before
the actual system is built. The use of
such modeling languages allows the
developers to create a model of the system,
execute it on their desktops, analyze
it with automated tools, and use
it to automatically generate code and
test cases.

Throughout this article we use MBD
to refer specifically to software developed
using synchronous dataflow languages
such as those found in MATLAB
Simulink and Esterel Technologies
SCADE Suite. Synchronous modeling
languages latch their inputs at the start
of a computation step, compute the
next system state and its outputs as a
single atomic step, and communicate
between components using dataflow
signals. This differs from the more
general class of modeling languages
that include support for asynchronous
execution of components and
communication using message passing.
MBD has become very popular in
the avionics and automotive industries
and we have found synchronous dataflow
models to be especially well suited
for automated verification using model
checking.

Model checkers are formal verification
tools that evaluate a model to
determine if it satisfies a given set of
properties. A model checker will consider
every possible combination of
inputs and state, making the verification
equivalent to exhaustive testing of
the model. If a property is not true, the
model checker produces a counterexample
showing how the property can
be falsified.

There are many types of model
checkers, each with its own
strengths and weaknesses. Explicit
state model checkers such as SPIN
construct and store a representation of
each state visited. Implicit state (symbolic)
model checkers use logical representations
of sets of states (such as
Binary Decision Diagrams) to describe
regions of the model state space that
satisfy the properties being evaluated.
Such compact representations generally
allow symbolic model checkers to
handle a much larger state space than
explicit state model checkers. We have
used the BDD-based model checker
NuSMV to analyze models with over
10'° reachable states. 

More recent model checkers, such
as SAL and Prover Plug-In, use satisfiability
modulo theories (SMT) solvers
for reasoning about infinite state
models containing real numbers and
unbounded arrays. These checkers
use a form of induction over the state
transition relation to automatically
prove that a property holds over all
executable paths in a model. While
these tools can handle a larger class of
models, the properties to be checked
must be written to support inductive
proof.
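
To see why a property may need to be rewritten before induction succeeds, consider the following Python sketch. It is a toy illustration only: the 3-bit counter, the function names, and the brute-force enumeration (standing in for the query an SMT solver would answer) are all assumptions made here, not part of SAL or Prover Plug-In.

```python
STATES = range(8)            # a toy 3-bit counter, not a model from the article


def is_initial(x: int) -> bool:
    return x == 0


def next_state(x: int) -> int:
    return (x + 2) % 8       # transition relation: the counter steps by two


def is_inductive(prop) -> bool:
    """Base case: prop holds in every initial state.
    Step case: prop is preserved by the transition out of ANY state
    that satisfies prop, whether or not that state is reachable."""
    base = all(prop(x) for x in STATES if is_initial(x))
    step = all(prop(next_state(x)) for x in STATES if prop(x))
    return base and step


def never_five(x: int) -> bool:
    return x != 5            # true of every reachable state, yet not inductive


def always_even(x: int) -> bool:
    return x % 2 == 0        # stronger invariant that implies never_five
```

Here `never_five` fails the step case because the unreachable state 3 satisfies it and steps to 5, while the strengthened `always_even` passes and implies the original property. Finding such strengthenings is the extra work the inductive approach asks of the property writer.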


The Translator Framework 

As part of NASA’s Aviation Safety Program
(AvSP), Rockwell Collins and the
University of Minnesota developed
a product family of translators that
bridge the gaps between some of the
most popular commercial modeling
languages and several model checkers
and theorem provers. An overview of
this framework is shown in Figure 1.

These translators work primarily
with the Lustre formal specification
language, but this is hidden from the
users. The starting point for translation
is a design model in MATLAB
Simulink/Stateflow or Esterel Technologies
SCADE Suite/Safe State Machines.
SCADE Suite produces Lustre models
directly. Simulink or Stateflow models
can be imported using SCADE Suite or
the Reactis tool and a translator
developed by Rockwell Collins. To ensure
each Simulink or Stateflow construct
has a well-defined semantics, the
translator restricts the models that it will
accept to those that can be translated
unambiguously into Lustre.

Once in Lustre, the specification
is loaded into an abstract syntax tree
(AST) and a number of transformation
passes are applied to it. Each transformation
pass produces a new Lustre AST
that is syntactically closer to the target
specification language and preserves
the semantics of the original Lustre
specification. This allows all Lustre
type checking and analysis tools to be
used as debugging aids during the
development of the translator. When the
AST is sufficiently close to the target
language, a pretty printer is used to
output the target specification.
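
The pass-based structure described above can be sketched in a few lines of Python. This is an illustration only and not the Rockwell Collins translator: the tuple-based Boolean AST, the two passes, and the output syntax are all hypothetical stand-ins for Lustre and a real target language.

```python
from typing import Callable

# A toy AST: ("const", bool) | ("var", name) | ("not", e) | ("and", e1, e2)
Ast = tuple


def remove_double_negation(e: Ast) -> Ast:
    """Pass 1: rewrite not(not(x)) to x, bottom-up."""
    if e[0] == "not":
        inner = remove_double_negation(e[1])
        return inner[1] if inner[0] == "not" else ("not", inner)
    if e[0] == "and":
        return ("and", remove_double_negation(e[1]), remove_double_negation(e[2]))
    return e


def fold_constants(e: Ast) -> Ast:
    """Pass 2: simplify nodes whose operands are constants."""
    if e[0] == "not":
        inner = fold_constants(e[1])
        return ("const", not inner[1]) if inner[0] == "const" else ("not", inner)
    if e[0] == "and":
        a, b = fold_constants(e[1]), fold_constants(e[2])
        for x, y in ((a, b), (b, a)):
            if x[0] == "const":
                return y if x[1] else ("const", False)
        return ("and", a, b)
    return e


def pretty_print(e: Ast) -> str:
    """Final step: emit concrete syntax for the (hypothetical) target language."""
    if e[0] == "const":
        return "TRUE" if e[1] else "FALSE"
    if e[0] == "var":
        return e[1]
    if e[0] == "not":
        return f"!{pretty_print(e[1])}"
    return f"({pretty_print(e[1])} & {pretty_print(e[2])})"


def translate(ast: Ast, passes: list[Callable[[Ast], Ast]]) -> str:
    for apply_pass in passes:
        ast = apply_pass(ast)  # each pass returns a new AST with the same meaning
    return pretty_print(ast)


source = ("and", ("not", ("not", ("var", "door_closed"))), ("const", True))
target = translate(source, [remove_double_negation, fold_constants])
```

Because every intermediate result is still a valid AST in the same language, any checker written for that language can be run between passes, which is the debugging benefit the paragraph above describes.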

We refer to our translator framework
as a product family since most transformation
passes are reused in the translators
for each target language. Reuse
of the transformation passes makes it
much easier to support new target languages;
we have developed new translators
in a matter of days. The number
of transformation passes depends on
the similarity of the source and target
languages and on the number of optimizations
to be made. Our translators
range in size from a dozen to over 60
passes.

The translators produce highly optimized
specifications appropriate for
the target language. For example, when
translating to NuSMV, the translator
eliminates as much redundant internal
state as possible, making it very efficient
for BDD-based model checking.
When translating to the PVS theorem
prover, the specification is optimized
for readability and to support the
development of proofs in PVS. When
generating executable C or Ada code, the
code is optimized for execution speed
on the target processor. These optimizations
can have a dramatic effect on
the target analysis tools. For example,
optimization passes incorporated into
the NuSMV translator reduced the
time required for NuSMV to check one
model from over 29 hours to less than
a second.

However, some optimizations are
better incorporated into the verification
tools rather than the translator.
For example, predicate abstraction is
a well-known technique for reducing
the size of the reachable state space,
but automating this during translation
would require a tight interaction between
our translator and the analysis
tool to iteratively refine the predicates


Counterexample.

Step               1        2         3
Inputs
  Start            0        1         0
  Clear            0        0         0
  Door Closed      0        1         0
  Steps to Cook    0        1         1
Outputs
  Mode             Setup    Cooking   Cooking
  Steps Remaining  0        1         0


Figure 1. The translator framework. 
based on the counterexamples. Since
many model checkers already implement
this technique, we have not tried
to incorporate it into our translator
framework.
We have developed tools to translate
the counterexamples produced by the
model checkers into two formats. The
first is a simple spreadsheet that shows
the inputs and outputs of the model for
each step (similar to steps noted in the
accompanying table). The second is a
test script that can be read by the
Reactis tool to step forward and backward
through the counterexample in the
Reactis simulator.

Our translator framework currently
supports input models written in
Simulink, Stateflow, and SCADE. It generates
specifications for the NuSMV, SAL,
and Prover model checkers, the PVS
and ACL2 theorem provers, and C and
Ada source code.


A Small Example


To make these ideas concrete, we present
a very small example, the mode logic
for a simple microwave oven shown
in Figure 2. The microwave initially
starts in Setup mode. It transitions to
Running mode when the Start button
is pressed and the Steps Remaining to
cook (initially provided by the keypad
entry subsystem) is greater than zero.
On transition to Running mode, the
controller enters either the Cooking or
Suspended submode, depending on
whether Door Closed is true. In Cooking
mode, the controller decrements Steps
Remaining on each step. If the door is
opened in Cooking mode or the operator
presses the Clear button, the controller
enters the Suspended submode.
From the Suspended submode, the operator
can return to Cooking submode
by pressing the Start button while the
door is closed, or return to Setup mode
by pressing the Clear button. When
Steps Remaining decrements to zero,
the controller exits Running mode and
returns to Setup mode.

Figure 2. Microwave mode logic.

Since this model consists only of
Boolean values (Start, Clear, Door
Closed), enumerated types (mode),
and two small integers (Steps Remaining
and Steps to Cook range from 0 to
639, the largest value that can be entered
on the keypad) it is well suited
for analysis with a symbolic model
checker such as NuSMV. A valuable
property to check is that the door is
always closed when the microwave is
cooking. In CTL (one of the property
specification languages of NuSMV),
this is written as:

AG(Cooking -> Door_Closed)

Translation of the model into
NuSMV and checking this property
takes only a few seconds and yields the
counterexample shown in Table 1.

In step 2 of the counterexample,
we see the value of Start change from
0 to 1, indicating the start button was
pressed. Also in step 2, the door is
closed and Steps Remaining takes on
the value 1. As a result, the microwave
enters Cooking mode in step 2. In step
3, the door is opened, but the microwave
remains in Cooking mode, violating
our safety property.

To better understand how this happened,
we use Reactis to step through
the generated counterexample. This
reveals that instead of taking the transition
from Cooking to Suspended when
the door is opened, the microwave took
the transition from Cooking to Cooking
that decrements Steps Remaining
because this transition has a higher
priority (priority 1) than the transition
from Cooking to Suspended (priority 2).
Worse, the microwave would continue
cooking with the door open until Steps
Remaining becomes zero. Changing
the priority of these two transitions
and rerunning the model checker
shows that in all possible states, the
door is always closed if the microwave
is cooking.

While this example is tiny, the two
integers (Steps Remaining and Steps
to Cook) still push its reachable state
space to 9.8 x 10^6 states. Also note that
the model checker does not necessarily
find the “best” counterexample. It
actually would have been clearer if the
Steps Remaining had been set to a value
larger than 1 in step 2. However, this
counterexample is very typical. In the
production models that we have examined,
very few counterexamples are
longer than a few steps.
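
The microwave check can be imitated with a small explicit-state search. The Python sketch below is an illustration under stated assumptions, not the NuSMV encoding produced by the translator: the `step` function is one reading of the prose description, the keypad range is shrunk from 0..639 to 0..3 (`MAX_STEPS`) to keep the search tiny, and the `door_has_priority` flag is a hypothetical switch between the original and the corrected transition priorities.

```python
from collections import deque
from itertools import product

SETUP, COOKING, SUSPENDED = "Setup", "Cooking", "Suspended"
MAX_STEPS = 3  # assumption: shrink the keypad range 0..639 to keep the sketch fast


def step(state, inputs, door_has_priority):
    """One synchronous step of the mode logic, as read from the prose."""
    mode, remaining = state
    start, clear, door_closed, steps_to_cook = inputs
    if mode == SETUP:
        remaining = steps_to_cook
        if start and remaining > 0:
            mode = COOKING if door_closed else SUSPENDED
    elif remaining <= 0:
        mode = SETUP                      # cooking finished: leave Running
    elif mode == COOKING:
        interrupted = clear or not door_closed
        if interrupted and door_has_priority:
            mode = SUSPENDED
        else:
            remaining -= 1                # original model: this transition wins
    elif mode == SUSPENDED:
        if start and door_closed:
            mode = COOKING
        elif clear:
            mode = SETUP
    return mode, remaining


def check_door_closed_when_cooking(door_has_priority):
    """Breadth-first search of all reachable states.
    Returns a shortest counterexample trace, or None if the property holds."""
    all_inputs = list(product([False, True], [False, True], [False, True],
                              range(MAX_STEPS + 1)))
    initial = (SETUP, 0)
    traces = {initial: []}                # state -> (inputs, state) steps reaching it
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for inputs in all_inputs:
            nxt = step(state, inputs, door_has_priority)
            trace = traces[state] + [(inputs, nxt)]
            if nxt[0] == COOKING and not inputs[2]:
                return trace              # AG(Cooking -> Door_Closed) is violated
            if nxt not in traces:
                traces[nxt] = trace
                queue.append(nxt)
    return None


counterexample = check_door_closed_when_cooking(door_has_priority=False)
fixed = check_door_closed_when_cooking(door_has_priority=True)
```

With the original priorities the search returns a two-step trace that ends in Cooking with the door open; with the priorities swapped it exhausts the state space and returns `None`. A symbolic checker reaches the same verdict without enumerating states one by one, which is what lets it scale to the full keypad range.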


Case Studies 
To be of any real value, model check- 
ing must be able to handle much larg- 
er problems. Three case studies on the 
application of our tools to industrial 
examples are described here. A fourth 
case study is discussed in Miller et al.

ADGS-2100 Window Manager. One of
the largest and most successful applications
of our tools was to the ADGS-2100
Adaptive Display and Guidance
System Window Manager. In modern
aircraft, pilots are provided aircraft 
status primarily through computer- 
ized display panels similar to those 
shown in Figure 3. The ADGS-2100 is 
a Rockwell Collins product that pro- 
vides the heads-down and heads-up 
displays and display management 
software for next-generation commer- 
cial aircraft. 

Figure 3. Pilot display panels.

The Window Manager (WM) ensures
that data from different applications is
routed to the correct display panel. In
normal operation, the WM determines 
which applications are being displayed 
in response to the pilot selections. 
However, in the case of a component 
failure, the WM also decides which in- 
formation is most critical and routes 
this information from one of the re- 
dundant sources to the most appropri- 
ate display panel. The WM is essential 
to the safe flight of the aircraft. If the 
WM contains logic errors, critical flight 
information could be unavailable to 
the flight crew. 

While very complex, the WM is specified
in Simulink using only Booleans
and enumerated types, making it ideal
for verification using a BDD-based model
checker such as NuSMV. The WM is
composed of five main components
that can be analyzed independently.
These five components contain a total
of 16,117 primitive Simulink blocks
that are grouped into 4,295 instances
of Simulink subsystems. The reachable


state space of the five components rang- 
es from 9.8 x 10° to 1.5 x 10” states. 

Ultimately, 563 properties about
the WM were developed and checked,
and 98 errors were found and corrected
in early versions of the WM model.
This verification was done early in the
design process while the design was
still changing. By the end of the project,
the WM developers were checking
the properties after every design
change.


CerTA FCS Phase I. Our second case
study was sponsored by the U.S. Air
Force Research Laboratory (AFRL) under
the Certification Technologies for
Advanced Flight Critical Systems (CerTA
FCS) program in order to compare
the effectiveness of model checking
and testing. In this study, we applied
our tools to the Operational Flight
Program (OFP) of an unmanned aerial
vehicle developed by Lockheed Martin
Aerospace. The OFP is an adaptive
flight control system that modifies its
behavior in response to flight conditions.
Phase I of the project concentrated
on applying our tools to the
Redundancy Management (RM) logic,
which is based almost entirely on
Boolean and enumerated types.

The RM logic was broken down
into three components that could be
analyzed individually. While relatively 
small (they contained a total of 169 
primitive Simulink blocks organized 
into 23 subsystems, with reachable 
state spaces ranging from 2.1 x 10? 
to 6.0 x 10" states), the RM logic was 
replicated in the OFP once for each of 
the 10 control surfaces on the aircraft, 
making it a significant portion of the 
OFP logic. 

To compare the effectiveness of
model checking and testing at discovering
errors, this project had two independent
verification teams, one that
used testing and one that used model
checking. The formal verification team
developed a total of 62 properties from
the OFP requirements and checked
these properties with the NuSMV model
checker, uncovering 12 errors in the
RM logic. Of these 12 errors, four were
classified by Lockheed Martin as severity
3 (only severity 1 and 2 can affect
the safety of flight), two were classified
as severity 4, two resulted in requirements
changes, one was redundant,
and three resulted from requirements
that had not yet been implemented in
the release of the software.

In similar fashion, the testing team
developed a series of tests from the
same OFP requirements. Even though
the testing team invested almost half
as much time in testing as the formal
verification team spent in model
checking, testing failed to find any errors.
The main reason for this was that
the demonstration was not a comprehensive
test program. While some of
these errors could be found through
testing, the cost would be much higher,
both to find and fix the errors. In
addition, the errors found through
model checking tended to be intermittent,
near simultaneous, or combinatory
sequences of failures that would
be very difficult to detect through testing.
The conclusion of both teams was
that model checking was shown to
be more cost effective than testing in
finding design errors.

CerTA FCS Phase II. The purpose of
Phase II of the CerTA FCS project was
to investigate whether model checking
could be used to verify large, numerically
intensive models. In this study,

the translation framework and model
checking tools were used to verify
important properties of the Effector
Blender (EB) logic of an OFP for a UAV
similar to that verified in Phase I.

The EB is a central component of
the OFP that generates the actuator
commands for the aircraft’s six control
surfaces. It is a large, complex model
that repeatedly manipulates a 3 x 6
matrix of floating point numbers. It takes
32 floating point inputs and a 3 x
6 matrix of floating point numbers and
outputs a 1 x 6 matrix of floating point
numbers. It contains over 2,000 basic
Simulink blocks organized into 166
Simulink subsystems, many of which
are Stateflow models.

Because of its extensive use of floating
point numbers and large state
space, the EB cannot be verified using
a BDD-based model checker such as
NuSMV. Instead, the EB was analyzed
using the Prover SMT-solver from
Prover Technologies. Even with the additional
capabilities of Prover, several
new issues had to be addressed, the
hardest being dealing with floating
point numbers.

While Prover has powerful decision
procedures for linear arithmetic with
real numbers and bit-level decision
procedures for integers, it does not
have decision procedures for floating
point numbers. Translating the floating
point numbers into real numbers
was rejected since much of the arithmetic
in the EB is inherently nonlinear.
Also, the use of real numbers would
mask floating point arithmetic errors
such as overflow and underflow.

Instead, the translator framework
was extended to convert floating point
numbers to fixed point numbers using
a scaling factor provided by the OFP
designers. The fixed point numbers
were then converted to integers using
bit-shifting to preserve their magnitude.
While this allowed the EB to be
verified using Prover’s bit-level integer
decision procedures, the results were
unsound due to the loss of precision.
However, if errors were found in the
verified model, their presence could
easily be confirmed in the original
model. This allowed the verification
to be used as a highly effective debugging
step, even though it did not guarantee
correctness.
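
The loss of precision mentioned above is easy to reproduce. The Python sketch below shows a generic fixed point encoding, assuming a hypothetical scaling of 8 fractional bits (`FRAC_BITS`); the actual scaling factors and conversion rules used for the EB came from the OFP designers and are not given in this article.

```python
FRAC_BITS = 8  # hypothetical scaling: values are stored as multiples of 1/256


def to_fixed(x: float) -> int:
    """Scale a float and round it to the nearest integer representation."""
    return round(x * (1 << FRAC_BITS))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two scaled integers, shifting right to restore the scale."""
    return (a * b) >> FRAC_BITS


def to_float(a: int) -> float:
    """Convert a scaled integer back to a float for comparison."""
    return a / (1 << FRAC_BITS)


exact = 0.1 * 0.3
approx = to_float(fixed_mul(to_fixed(0.1), to_fixed(0.3)))
error = abs(exact - approx)  # nonzero: rounding and the shift both discard bits
```

Because the integer model only approximates the floating point one, a property proved on it does not carry over automatically, but a concrete counterexample found on it can be replayed against the original model, as the text notes.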

Determining what properties to
verify was also a difficult problem. The
requirements for the EB are actually
specified for the combination of the
EB and the aircraft model, but checking
both the EB and the aircraft model
exceeded the capabilities of the Prover
Plug-In model checker. After consultation
with the OFP designers, the verification
team decided to verify whether
the six actuator commands would always
be within dynamically computed
upper and lower limits. Violation of
these properties would indicate a design
error in the EB logic.

Even with these adjustments, the 
EB model was large enough that it had 
to be decomposed into a hierarchy of 
components several levels deep. The 
leaf nodes of this hierarchy were then 
verified using Prover Plug-In and their
composition was verified
using manual proofs. This approach
also ensured that unsoundness could 
not be introduced through circular 
reasoning since Simulink enforces 
the absence of cyclic dependencies 
between atomic subsystems. 

Ultimately, five errors in the EB de- 
sign logic were discovered and correct- 
ed through model checking of these 
properties. In addition, several poten- 
tial errors that were being masked by 
defensive design practices were found 
and corrected. 


Lessons from the Case Studies 

The case studies described here dem- 
onstrate that model checking can be 
effectively used to find errors early in 
the development process for many 
classes of models. In particular, even 
very complex models can be verified 
with BDD-based model checkers if 
they consist primarily of Boolean and 
enumerated types. Every industrial 
system we have studied contains large 
sections that either meet this con- 
straint or can be made to meet it with 
some alteration. 

For this class of models, the tools 
are simple enough for developers to use 
them routinely and without extensive 
training. In our experience, a single day 
of training and a low level of ongoing 
mentoring are usually sufficient. This 
also makes it practical to perform mod- 
el checking early in the development 
process while a model is still changing. 

Running a set of properties after
each model revision is a quick and
easy way to see if anything has been
broken. We encourage our developers
to “check your models early and check
them often.” The time spent model
checking is recovered several times
over by avoiding rework during unit
and integration testing.

Since model checking examines every
possible combination of input and
state, it is also far more effective at finding
design errors than testing, which
can only check a small fraction of the
possible inputs and states. As demonstrated
by the CerTA FCS Phase I case
study, it can also be more cost effective
than testing.


Future Directions 

There are many directions for further
research. As illustrated in the CerTA
FCS Phase II study, numerically intensive
models still pose a challenge for
model checking. SMT-based model
checkers hold promise for verification
of these systems, but the need to write
properties that can be verified through
induction over the state transition relation
makes them more difficult for
developers to use.

Most industrial models used to 
generate code make extensive use of 
floating point numbers. Other models, 
particularly those that deal with spatial 
relationships such as navigation, make 
extensive use of trigonometric and other 
transcendental functions. A sound and 
efficient way of checking systems using 
floating point arithmetic and transcen- 
dental functions would be very helpful. 


It can also be difficult to determine
how many properties must be checked.
Our experience has been that checking
even a few properties will find errors,
but that checking more properties
will find more errors. Unlike testing,
for which many objective coverage criteria
have been developed, completeness
criteria for properties do not seem
to exist. Techniques for developing
or measuring the adequacy of a set of
properties are needed.

As discussed in the CerTA FCS Phase
II case study, the verification of very large
models may be achieved by using model
checking on subsystems and more traditional
reasoning to compose the subsystems.
Combining model checking
and theorem proving in this way could
be a very effective approach to the compositional
verification of large systems.

How Coverity built a bug-finding tool, and
a business, around the unlimited supply
of bugs in software systems.


BY AL BESSEY, KEN BLOCK, BEN CHELF, ANDY CHOU, 
BRYAN FULTON, SETH HALLEM, CHARLES HENRI-GROS, 
ASYA KAMSKY, SCOTT MCPEAK, AND DAWSON ENGLER 


A Few Billion 
Lines of 
Code Later 


Using Static Analysis 
to Find Bugs in 
the Real World 


IN 2002, COVERITY commercialized a research static
bug-finding tool. Not surprisingly, as academics,
our view of commercial realities was not perfectly 
accurate. However, the problems we encountered 
were not the obvious ones. Discussions with tool 
researchers and system builders suggest we were 
not alone in our naiveté. Here, we document some 
of the more important examples of what we learned 
developing and commercializing an industrial- 
strength bug-finding tool. 

We built our tool to find generic errors (such as 
memory corruption and data races) and system- 
specific or interface-specific violations (such as 
violations of function-ordering constraints). The tool, 
like all static bug finders, leveraged
the fact that programming rules often
map clearly to source code; thus static
inspection can find many of their violations.
For example, to check the rule
“acquired locks must be released,” a
checker would look for relevant operations
(such as lock() and unlock())
and inspect the code path after, flagging
rule disobedience (such as lock() with
no unlock() and double locking).
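
The lock rule above can be illustrated with a deliberately tiny checker. This Python sketch is not Coverity's implementation: it assumes a code path has already been flattened into a list of operation names, and the function name and report strings are invented for the example. A real tool would have to extract such paths from source code and follow them across branches and calls.

```python
def check_lock_rule(path: list[str]) -> list[str]:
    """Scan one straight-line code path and report lock-rule violations."""
    reports = []
    held = False
    for index, op in enumerate(path):
        if op == "lock()":
            if held:
                reports.append(f"op {index}: double lock")
            held = True
        elif op == "unlock()":
            if not held:
                reports.append(f"op {index}: unlock without lock")
            held = False
    if held:
        reports.append("end of path: lock never released")
    return reports


ok_path = ["lock()", "update()", "unlock()"]
leaky_path = ["lock()", "update()", "return"]
double_path = ["lock()", "lock()", "unlock()"]
```

Calling `check_lock_rule` on the three sample paths yields no report for the first, a missing-release report for the second, and a double-lock report for the third.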

For those who keep track of such
things, checkers in the research system
typically traverse program paths
(flow-sensitive) in a forward direction, going
across function calls (inter-procedural)
while keeping track of call-site-specific
information (context-sensitive) and
toward the end of the effort had some
of the support needed to detect when a
path was infeasible (path-sensitive).

A glance through the literature reveals
many ways to go about static bug
finding. For us, the central religion
was results: If it worked, it was
good, and if not, not. The ideal: check
millions of lines of code with little
manual setup and find the maximum
number of serious true errors with the
minimum number of false reports. As
much as possible, we avoided using
annotations or specifications to reduce
manual labor.


Like the PREfix product, we were
also unsound. Our product did not verify
the absence of errors but rather tried
to find as many of them as possible. Unsoundness
let us focus on handling the
easiest cases first, scaling up as it proved
useful. We could ignore code constructs
that led to high rates of false-error messages
(false positives) or analysis complexity,
in the extreme skipping problematic
code entirely (such as assembly
statements, functions, or even entire
files). Circa 2000, unsoundness was
controversial in the research community,
though it has since become almost a
de facto tool bias for commercial products
and many research projects.
Initially, publishing was the main 
force driving tool development. We 
would generally devise a set of checkers 
or analysis tricks, run them over a few
million lines of code (typically Linux),
count the bugs, and write everything 
up. Like other early static-tool research- 
ers, we benefited from what seems an 
empirical law: Assuming you have a rea- 
sonable tool, if you run it over a large, 
previously unchecked system, you 
will always find bugs. If you don’t, the 
immediate knee-jerk reaction is that 
something must be wrong. Misconfigu- 
ration? Mistake with macros? Wrong 
compilation target? If programmers 
must obey a rule hundreds of times, 
then without an automatic safety net 
they cannot avoid mistakes. Thus, even 
our initial effort with primitive analysis 
found hundreds of errors. 


This is the research context. We now 
describe the commercial context. Our 
rough view of the technical challenges of 
commercialization was that given that 
the tool would regularly handle “large 
amounts” of “real” code, we needed 
only a pretty box; the rest was a business
issue. This view was naive. While we include
many examples of unexpected obstacles
here, they devolve mainly from
consequences of two main dynamics: 

First, in the research lab a few peo- 
ple check a few code bases; in reality 
many check many. The problems that 
show up when thousands of program- 
mers use a tool to check hundreds (or 
even thousands) of code bases do not 
show up when you and your co-authors 
check only a few. The result of sum- 
ming many independent random vari- 
ables? A Gaussian distribution, most 
of it not on the points you saw and 
adapted to in the lab. Furthermore, 
Gaussian distributions have tails. As 
the number of samples grows, so, too, 
does the absolute number of points 
several standard deviations from the 
mean. The unusual starts to occur with
increasing frequency.


W. Bradford Paley’s CodeProfiles was 
originally commissioned for the Whitney 
Museum of American Art’s “CODeDOC” 
Exhibition and later included in MoMA’s 
“Design and the Elastic Mind” exhibition. 
CodeProfiles explores the space of code 
itself; the program reads its source into 
memory, traces three points as they once 
moved through that space, then prints itself 
on the page. 


For code, these features include
problematic idioms, the types of false
positives encountered, the distance
of a dialect from a language standard,
and the way the build works. For developers,
variations appear in raw ability,
knowledge, the amount they care
about bugs, false positives, and the
types of both. A given company won’t
deviate in all these features but, given
the number of features to choose from,
often includes at least one weird oddity.
Weird is not good. Tools want expected.
Expected you can tune a tool to
handle; surprise interacts badly with
tuning assumptions.
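
The point about tails can be made with a line of arithmetic. Taking the Gaussian picture literally (an idealization, since real code bases and developers are not drawn from a normal distribution), the expected number of samples more than k standard deviations from the mean is n times erfc(k / sqrt(2)). The function name below is made up for this sketch.

```python
import math

def expected_outliers(n, k):
    """Expected count of n independent Gaussian samples lying more than
    k standard deviations from the mean (both tails combined)."""
    tail_probability = math.erfc(k / math.sqrt(2))
    return n * tail_probability

# A lab-sized sample versus a customer-base-sized one, at 3 sigma.
for n in (10, 1_000, 100_000):
    print(n, round(expected_outliers(n, 3), 2))
```

The per-sample probability never changes (about 0.27% at three standard deviations), but at 100,000 samples it adds up to roughly 270 oddities, where a lab checking ten code bases would expect to see none.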

Second, in the lab the user’s values, 
knowledge, and incentives are those 
of the tool builder, since the user and
the builder are the same person. Deployment
leads to severe fission; users
often have little understanding of
the tool and little interest in helping
develop it (for reasons ranging from
simple skepticism to perverse reward
incentives) and typically label any
message they find confusing as false. A
tool that works well under these constraints
looks very different from one
tool builders design for themselves.
However, for every user who lacks 
the understanding or motivation one 
might hope for, another is eager to understand
how it all works (or perhaps already
does), willing to help even beyond
what one might consider reasonable.
Such champions make sales as easily as
their antithesis blocks them. However,
since their main requirements tend to 
be technical (the tool must work) the 
reader likely sees how to make them 
happy, so we rarely discuss them here. 
Most of our lessons come from two 
different styles of use: the initial trial of 
the tool and how the company uses the
tool after buying it. The trial is a pre-sale
demonstration that attempts to show
that the tool works well on a potential
customer’s code. We generally ship a
salesperson and an engineer to the customer’s
site. The engineer configures
the tool and runs it over a given code
base and presents results soon after. Initially,
the checking run would happen
in the morning, and the results meeting
would follow in the afternoon; as code
size at trials grows it’s not uncommon
to split them across two (or more) days.

Sending people to a trial dramatical- 
ly raises the incremental cost of each 
sale. However, it gives the non-trivial 
benefit of letting us educate customers 
(so they do not label serious, true bugs
as false positives) and do real-time, ad
hoc workarounds of weird customer 
system setups. 

The trial structure is a harsh test for 
any tool, and there is little time. The 
checked system is large (millions of 
lines of code, with 20-30MLOC a pos- 
sibility). The code and its build system 
are both difficult to understand. However,
the tool must routinely go from
never seeing the system previously to 
getting good bugs in a few hours. Since 
we present results almost immediately 
after the checking run, the bugs must 
be good with few false positives; there 
is no time to cherry pick them. 

Furthermore, the error messages 
must be clear enough that the sales en- 
gineer (who didn’t build the checked 
system or the tool) can diagnose and 
explain them in real time in response 
to “What about this one?” questions. 

The most common usage model for 
the product has companies run it as 
part of their nightly build. Thus, most 
require that checking runs complete in 
12 hours, though those with larger code 
bases (10+MLOC) grudgingly accept 
24 hours. A tool that cannot analyze 
at least 1,400 lines of code per minute 
makes it difficult to meet these targets. 
During a checking run, error messages 
are put in a database for subsequent 
triaging, where users label them as 
true errors or false positives. We spend 
significant effort designing the system 
so these labels are automatically reap- 
plied if the error message they refer to 
comes up on subsequent runs, despite 
code-dilating edits or analysis-chang- 
ing bug-fixes to checkers. 
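
One plausible way to keep triage labels attached across runs, offered here as an assumption and not as a description of the product's mechanism, is to key each report on features that survive unrelated edits (checker name, file, enclosing function, and the whitespace-normalized text of the flagged line) and never on line numbers. All names below are hypothetical.

```python
import hashlib

def fingerprint(checker, path, function, flagged_line):
    """Return a stable key for a report that ignores line numbers.

    Whitespace in the flagged source line is normalized so that reformatting,
    or code inserted elsewhere in the file, does not change the key.
    """
    normalized = " ".join(flagged_line.split())
    key = "\x00".join([checker, path, function, normalized])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

def reapply_labels(old_labels, new_reports):
    """Carry earlier triage labels over to the reports of a new run.

    old_labels: dict mapping fingerprint -> label from earlier triage.
    new_reports: list of dicts with checker, path, function, flagged_line.
    Returns a list of (report, label_or_None) pairs; None means untriaged.
    """
    result = []
    for report in new_reports:
        key = fingerprint(report["checker"], report["path"],
                          report["function"], report["flagged_line"])
        result.append((report, old_labels.get(key)))
    return result
```

A key this simple breaks as soon as the flagged line itself is edited or the function is renamed, which hints at why surviving "code-dilating edits" and checker changes takes significant design effort.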

As of this writing (December 2009), 
approximately 700 customers have 
licensed the Coverity Static Analysis 
product, with somewhat more than a 
billion lines of code among them. We 
estimate that since its creation the tool 
has analyzed several billion lines of 
code, some more difficult than others. 

Caveats. Drawing lessons from a sin- 
gle data point has obvious problems. 
Our product’s requirements roughly 
form a “least common denominator” 
set needed by any tool that uses non- 
trivial analysis to check large amounts 
of code across many organizations; the 
tool must find and parse the code, and 
users must be able to understand er- 
ror messages. Further, there are many 
ways to handle the problems we have 
encountered, and our way may not be 
the best one. We discuss our methods 
more for specificity than as a claim of 
solution. 

Finally, while we have had success
as a static-tools company, these are
small steps. We are tiny compared to
mature technology companies. Here,
too, we have tried to limit our discussion
to conditions likely to be true in a
larger setting.


Laws of Bug Finding 

The fundamental law of bug finding 
is No Check = No Bug. If the tool can’t 
check a system, file, code path, or given 
property, then it won’t find bugs in it. 
Assuming a reasonable tool, the first 
order bound on bug counts is just how 
much code can be shoved through the 
tool. Ten times more code is 10 times 
more bugs. 

We imagined this law was as simple
a statement of fact as we needed. Unfortunately,
two seemingly vacuous corollaries
place harsh first-order bounds
on bug counts:

Law: You can’t check code you don’t 
see. It seems too trite to note that check- 
ing code requires first finding it... until 
you try to do so consistently on many 
large code bases. Probably the most re- 
liable way to check a system is to grab its 
code during the build process; the build 
system knows exactly which files are in- 
cluded in the system and how to com- 
pile them. This seems like a simple task. 
Unfortunately, it’s often difficult to un- 
derstand what an ad hoc, homegrown 
build system is doing well enough to ex- 
tract this information, a difficulty com- 
pounded by the near-universal absolute 
edict: “No, you can’t touch that.” By de- 
fault, companies refuse to let an exter- 
nal force modify anything; you cannot 
modify their compiler path, their broken
makefiles (if they have any), or in any
way write or reconfigure anything other 
than your own temporary files. Which is 
fine, since if you need to modify it, you 
most likely won’t understand it. 

Further, for isolation, companies 
often insist on setting up a test ma- 
chine for you to use. As a result, not 
infrequently the build you are given to 
check does not work in the first place, 
which you would get blamed for if you 
had touched anything. 

Our approach in the initial months 
of commercialization in 2002 was a 
low-tech, read-only replay of the build 
commands: run make, record its out- 
put in a file, and rewrite the invoca- 
tions to their compiler (such as gcc) 
to instead call our checking tool, then 
rerun everything. Easy and simple. 
This approach worked perfectly in the
lab and for a small number of our earliest
customers. We then had the following
conversation with a potential
customer:

“How do we run your tool?” 

“Just type ‘make’ and we’ll rewrite 
its output.” 

“What’s ‘make’? We use ClearCase.”

“Uh, what’s ClearCase?”

This turned out to be a chasm we 
couldn’t cross. (Strictly speaking, the 
customer used ‘ClearMake,’ but the 
superficial similarities in name are en- 
tirely unhelpful at the technical level.) 
We skipped that company and went
to a few others. They exposed other
problems with our method, which we
papered over with 90% hacks. None
seemed so troublesome as to force us
to rethink the approach, at least until
we got the following support call from
a large customer:

“Why is it when I run your tool, I
have to reinstall my Linux distribution
from CD?”

This was indeed a puzzling ques- 
tion. Some poking around exposed the 
following chain of events: the company’s
make used a novel format to print
out the absolute path of the directory
in which the compiler ran; our script
misparsed this path, producing the
empty string that we gave as the destination
to the Unix “cd” (change directory)
command, causing it to change
to the top level of the system; it ran
“rm -rf *” (recursive delete) during
compilation to clean up temporary
files; and the build process ran as root.
Summing these points produces the 
removal of all files on the system. 
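
The chain of events hinged on a misparsed directory silently becoming an empty string. As a hedged sketch (this is not the original script), a replay tool could extract the directory from GNU make's "Entering directory" messages and refuse to continue when the result is empty or not absolute. The regular expression assumes GNU make's usual message format, which is exactly the assumption the customer's make violated; the function name is invented for this example.

```python
import re

# GNU make prints: make[1]: Entering directory '/path' (older versions
# open the quote with a backtick). Any other format is not recognized.
ENTER_DIR = re.compile(r"Entering directory [`'](.*)'\s*$")

def directory_from_make_line(line):
    """Extract the directory from a GNU make "Entering directory" line.

    Returns None if the line is not such a message. Raises ValueError rather
    than returning an empty or relative path, so that a later "cd" can never
    be handed a misparsed destination.
    """
    match = ENTER_DIR.search(line)
    if match is None:
        return None
    directory = match.group(1)
    if not directory.startswith("/"):
        raise ValueError(f"refusing suspicious directory {directory!r}")
    return directory
```

Failing loudly on an unexpected format costs a trial run; guessing cost the customer a Linux installation.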

The right approach, which we have 
used for the past seven years, kicks off 
the build process and intercepts every 
system call it invokes. As a result, we can 
see everything needed for checking, in- 
cluding the exact executables invoked, 
their command lines, the directory 
they run in, and the version of the compiler
(needed for compiler-bug workarounds).
This control makes it easy to
grab and precisely check all source code,
to the extent of automatically changing 
the language dialect on a per-file basis. 

To invoke our tool users need only 
call it with their build command as an 
argument: 


cov-build <build command>


We thought this approach was bulletproof.
Unfortunately, as the astute reader
has noted, it requires a command
prompt. Soon after implementing it we
went to a large company, so large it had
a hyperspecialized build engineer, who
engaged in the following dialogue:

“How do I run your tool?” 

“Oh, it’s easy. Just type ‘cov-build’ 
before your build command.” 

“Build command? I just push this 
[GUI] button...” 

Social vs. technical. The social restriction
that you cannot change anything,
no matter how broken it may be, forces
ugly workarounds. A representative example
is: Build interposition on Windows
requires running the compiler in
the debugger. Unfortunately, doing so
causes a very popular Windows C++ compiler
(Visual Studio C++ .NET 2003) to
prematurely exit with a bizarre error
message. After some high-stress fussing,
it turns out that the compiler has a
use-after-free bug, hit when code used a
Microsoft-specific C language extension
(certain invocations of its #using directive).
The compiler runs fine in normal
use; when it reads the freed memory,
the original contents are still there, so
everything works. However, when run
with the debugger, the compiler switches
to using a “debug malloc,” which on
each free call sets the freed memory
contents to a garbage value. The subsequent
read returns this value, and the
compiler blows up with a fatal error.
The sufficiently perverse reader can no
doubt guess the “solution.”[a]
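
The difference between the two allocators can be mimicked with a toy heap. This is a simulation written purely for illustration, not Microsoft's allocator: the class ToyHeap, its poison value 0xDD, and its method names are all assumptions of the sketch.

```python
class ToyHeap:
    """A toy heap of fixed cells with an optional poison-on-free mode."""

    POISON = 0xDD  # arbitrary garbage value chosen for this sketch

    def __init__(self, size, debug=False):
        self.cells = [0] * size
        self.live = set()      # addresses currently allocated
        self.debug = debug

    def malloc(self):
        # Hand out the lowest free address.
        for address in range(len(self.cells)):
            if address not in self.live:
                self.live.add(address)
                return address
        raise MemoryError("toy heap exhausted")

    def free(self, address):
        self.live.discard(address)
        if self.debug:
            # A debug allocator overwrites freed memory with garbage.
            self.cells[address] = self.POISON

    def read(self, address):
        # As in C, reading freed memory is not checked.
        return self.cells[address]


def use_after_free(heap):
    p = heap.malloc()
    heap.cells[p] = 42
    heap.free(p)
    return heap.read(p)   # the bug: p was already freed
```

With ToyHeap(4) the stale read still returns 42, so the bug stays hidden; with ToyHeap(4, debug=True) the same read returns the poison value, which is the situation the compiler found itself in under the debugger.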

Law: You can’t check code you can’t 
parse. Checking code deeply requires 
understanding the code’s semantics. 
The most basic requirement is that you 
parse it. Parsing is considered a solved 
problem. Unfortunately, this view is na- 
ive, rooted in the widely believed myth 
that programming languages exist. 

The C language does not exist; nei- 
ther does Java, C++, and C#. While a 
language may exist as an abstract idea, 
and even have a pile of paper (a stan- 
dard) purporting to define it, a stan- 
dard is not a compiler. What language 
do people write code in? The character 
strings accepted by their compiler. 
Further, they equate compilation with 
certification. A file their compiler does
not reject has been certified as “C code”
no matter how blatantly illegal its contents
may be to a language scholar. Fed
this illegal not-C code, a tool’s C front-end
will reject it. This problem is the
tool’s problem.

[a] Immediately after process startup our tool
writes 0 to the memory location of the “in debugger”
variable that the compiler checks to
decide whether to use the debug malloc.

Compounding it (and others) the
person responsible for running the
tool is often not the one punished if the
checked code breaks. (This person also
often doesn’t understand the checked
code or how the tool works.) In particular,
since our tool often runs as part of
the nightly build, the build engineer
managing this process is often in charge
of ensuring the tool runs correctly.
Many build engineers have a single concrete
metric of success: that all tools terminate
with successful exit codes. They
see Coverity’s tool as just another speed
bump in the list of things they must get
through. Guess how receptive they are
to fixing code the “official” compiler accepted
but the tool rejected with a parse
error? This lack of interest generally extends
to any aspect of the tool for which
they are responsible.

Many (all?) compilers diverge from
the standard. Compilers have bugs. Or
are very old. Written by people who misunderstand
the specification (not just
for C++). Or have numerous extensions.
The mere presence of these divergences
causes the code they allow to appear.
If a compiler accepts construct X, then
given enough programmers and code,
eventually X is typed, not rejected, then
encased in the code base, where the
static tool will, not helpfully, flag it as a
parse error.

The tool can’t simply ignore divergent
code, since significant markets
are awash in it. For example, one enormous
software company once viewed
conformance as a competitive disadvantage,
since it would let others make
tools usable in lieu of its own. Embedded
software companies make great
tool customers, given the bug aversion
of their customers; users don’t like it if
their cars (or even their toasters) crash.
Unfortunately, the space constraints in
such systems and their tight coupling
to hardware have led to an astonishing
oeuvre of enthusiastically used compiler
extensions.

Finally, in safety-critical software
systems, changing the compiler often
requires costly re-certification. Thus,
we routinely see the use of decades-old
compilers. While the languages
these compilers accept have interesting
features, strong concordance with
a modern language standard is not one
of them. Age begets new problems.
Realistically, diagnosing a compiler’s
divergences requires having a copy of
the compiler. How do you purchase a
license for a compiler 20 versions old?
Or whose company has gone out of
business? Not through normal channels.
We have literally resorted to buying
copies off eBay.

This dynamic shows up in a softer
way with non-safety-critical systems; the
larger the code base, the more the sales
force is rewarded for a sale, skewing sales
toward such systems. Large code bases
take a while to build and often get tied to
the compiler used when they were born,
skewing the average age of the compilers
whose languages we must accept.

If divergence-induced parse errors are
isolated events scattered here and there,
then they don’t matter. An unsound tool
can skip them. Unfortunately, failure often
isn’t modular. In a sad, too-common
story line, some crucial, purportedly “C”
header file contains a blatantly illegal
non-C construct. It gets included by all
files. The no-longer-potential customer
is treated to a constant stream of parse
errors as your compiler rips through the
customer’s source files, rejecting each
in turn. The customer’s derisive stance
is, “Deep source code analysis? Your
tool can’t even compile code. How can
it find bugs?” It may find this event so
amusing that it tells many friends.

Tiny set of bad snippets seen in header
files. One of the first examples we encountered
of illegal-construct-in-key-header
file came up at a large networking
company

// "redefinition of parameter 'a'"
void foo(int a, int a);

The programmer names foo’s first
formal parameter a and, in a form of
lexical locality, the second as well.
Harmless. But any conformant compiler
will reject this code. Our tool certainly
did. This is not helpful; compiling
no files means finding no bugs, and
people don’t need your tool for that.
And, because its compiler accepted it,
the potential customer blamed us.

Here’s an opposite, less-harmless
case where the programmer is trying to
make two different things the same

typedef char int;

(“Useless type name in empty declaration.”)

And one where readability trumps
the language spec

unsigned x = 0xdead_beef;

(“Invalid suffix ‘_beef’ on integer
constant.”)

From the embedded space, creating
a label that takes no space

void x;

(“Storage size of ‘x’ is not known.”)

Another embedded example that
controls where the space comes from

unsigned x @ "text";

(“Stray ‘@’ in program.”)

A more advanced case of a nonstandard
construct is

Int16 ErrSetJump(ErrJumpBuf buf)
    = { 0x4E40 + 15, 0xA085; }

It treats the hexadecimal values of
machine-code instructions as program
source.

The award for most widely used extension
should, perhaps, go to Microsoft
support for precompiled headers.
Among the most nettlesome troubles
is that the compiler skips all the text
before an inclusion of a precompiled
header. The implication of this behavior
is that the following code can be
compiled without complaint:

I can put whatever I want here.
It doesn’t have to compile.
If your compiler gives an error, it sucks.
#include <some-precompiled-header.h>


Microsoft’s on-the-fly header fabri- 
cation makes things worse. 
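
A checker that wants to get past constructs like these has to rewrite them before its front-end sees them. The sketch below shows three such rewrites for the snippets above. The function names, the regular expressions, and the choice to blank out skipped lines are assumptions made for illustration and are not Coverity's code.

```python
import re

def drop_text_before_pch(source, pch_header):
    """Blank out every line before the #include of the precompiled header,
    mimicking a compiler that ignores that text. Lines are blanked, not
    removed, so later line numbers still match the original file."""
    lines = source.split("\n")
    include = re.compile(
        r'^\s*#\s*include\s*[<"]' + re.escape(pch_header) + r'[>"]')
    for index, line in enumerate(lines):
        if include.match(line):
            return "\n".join([""] * index + lines[index:])
    return source  # header never included: leave the file alone

def strip_hex_underscores(source):
    """Rewrite 0xdead_beef as 0xdeadbeef."""
    return re.sub(r"\b0[xX][0-9a-fA-F]+(?:_[0-9a-fA-F]+)+\b",
                  lambda m: m.group(0).replace("_", ""), source)

def strip_placement(source):
    """Rewrite `unsigned x @ "text";` as `unsigned x;`."""
    return re.sub(r'\s*@\s*"[^"]*"', "", source)
```

Text-level rewrites like these are blunt: they know nothing about strings, comments, or macros, so each one is only safe for the specific dialect it was written against.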

Assembly is the most consistently 
troublesome construct. It’s already 
non-portable, so compilers seem to 
almost deliberately use weird syn- 
tax, making it difficult to handle in a 
general way. Unfortunately, if a programmer
uses assembly it’s probably
to write a widely used function, and
if the programmer does it, the most
likely place to put it is in a widely used
header file. Here are two ways (out
of many) to issue a mov instruction

// First way
foo() {
    __asm mov eax, eab
    mov eax, eab;
}

// Second way
#pragma asm
__asm [ mov eax, eab mov eax, eab ]
#pragma end_asm


The only thing shared in addition to 
mov is the lack of common textual keys 
that can be used to elide them. 

We have thus far discussed only C, a
simple language; C++ compilers diverge
to an even worse degree, and we go to
great lengths to support them. On the 
other hand, C# and Java have been eas- 
ier, since we analyze the bytecode they 
compile to rather than their source. 
How to parse not-C with a C front-end. 
OK, so programmers use extensions. 
How difficult is it to solve this problem? 
Coverity has a full-time team of some of 
its sharpest engineers to firefight this banal,
technically uninteresting problem
as their sole job. They’re never done.[b]
We first tried to make the problem 
someone else’s problem by using the 
Edison Design Group (EDG) C/C++ 
front-end to parse code.* EDG has 
worked on how to parse real C code 
since 1989 and is the de facto indus- 
try standard front-end. Anyone decid- 
ing to not build a homegrown front- 
end will almost certainly license from 
EDG. All those who do build a home- 
grown front-end will almost certainly 
wish they did license EDG after a few 
experiences with real code. EDG aims 
not just for mere feature compatibility 


but for version-specific bug compatibility
across a range of compilers. Its
front-end probably resides near the 
limit of what a profitable company can 
do in terms of front-end gyrations. 
Unfortunately, the creativity of compiler
writers means that despite two decades
of work EDG still regularly meets
defeat when trying to parse real-world
large code bases.[c] Thus, our next step is
for each supported compiler, we write
a set of “transformers” that mangle
its personal language into something
closer to what EDG can parse. The
most common transformation simply
rips out the offending construct. As
one measure of how much C does not
exist, the table here counts the lines of
transformer code needed to make the
languages accepted by 18 widely used
compilers look vaguely like C. A line of
transformer code was almost always
written only when we were burned to a
degree that was difficult to work around.
Adding each new compiler to our list of
“supported” compilers almost always
requires writing some kind of transformer.
Unfortunately, we sometimes
need a deeper view of semantics so are
forced to hack EDG directly. This method
is a last resort. Still, at last count (as
of early 2009) there were more than
406(!) places in the front-end where we
had an #ifdef COVERITY to handle a
specific, unanticipated construct.

[b] Anecdotally, the dynamic memory-checking
tool Purify had an analogous struggle at the
machine-code level, where Purify’s developers
expended significant resources reverse engineering
the various activation-record layouts
used by different compilers.

[c] Coverity won the dubious honor of being the
single largest source of EDG bug reports after
only three years of use.

EDG is widely used as a compiler
front-end. One might think that for customers
using EDG-based compilers we
would be in great shape. Unfortunately,
this is not necessarily the case. Even ignoring
the fact that compilers based on
EDG often modify EDG in idiosyncratic
ways, there is no single “EDG front-end”
but rather many versions and possible
configurations that often accept a
slightly different language variant than
the (often newer) version we use. As a Sisyphean
twist, assume we cannot work
around and report an incompatibility. If
EDG then considers the problem important
enough to fix, it will roll it together
with other patches into a new version.
So, to get our own fix, we must upgrade
the version we use, often causing
divergence from other unupgraded
EDG compiler front-ends, and more issues
ensue.

Social versus technical. Can we get customer
source code? Almost always, no.
Despite nondisclosure agreements, even 
for parse errors and preprocessed code, 
though perhaps because we are viewed 
as too small to sue to recoup damages. 
As a result, our sales engineers must 
type problems in reports from memory. 
This works as well as you might expect. 
It’s worse for performance problems, 
which often show up only in large-code 
settings. But one shouldn’t complain, 
since classified systems make things 
even worse. Can we send someone on- 
site to look at the code? No. You listen to 
recited syntax on the phone. 


Bugs 

Do bugs matter? Companies buy bug- 
finding tools because they see bugs as 
bad. However, not everyone agrees that 
bugs matter. The following event has 
occurred during numerous trials. The 
tool finds a clear, ugly error (memory 
corruption or use-after-free) in important
code, and the interaction with the
customer goes like this:

“So?” 

“Isn’t that bad? What happens if 
you hit it?” 

“Oh, it'll crash. We'll get a call.” 
[Shrug.] 

If developers don’t feel pain, they 
often don’t care. Indifference can arise 
from lack of accountability; if QA can- 
not reproduce a bug, then there is no 
blame. Other times, it’s just odd: 

“Is this a bug?” 

“I’m just the security guy.” 

“That’s not a bug; it’s in third-party 
code.” 

“A leak? Don’t know. The author left 
years ago...” 

Lines of code per transformer for 18 common compilers we support.

 160 QNX            280 HP-UX        285 picc.cpp
 294 sun.java.cpp   384 st.cpp       334 cosmic.cpp
 421 intel.cpp      457 sun.cpp      603 iccmsa.cpp
 629 bcc.cpp        673 diab.cpp     756 xlc.cpp
 912 ARM            914 GNU         1294 Microsoft
1425 keil.cpp      1848 cw.cpp      1665 Metrowerks

No, your tool is broken; that is not
a bug. Given enough code, any bug-finding
tool will uncover some weird
examples. Given enough coders,
you’ll see the same thing. The following
utterances were culled from
trial meetings:

Upon seeing an error report saying
the following loop body was dead code

for(i = 1; i < 0; i++)
    deadcode

“No, that’s a false positive; a loop executes
at least once.”

For this memory corruption error
(32-bit machine)

int a[2], b;
memset(a, 0, 12);

“No, I meant to do that; they are next
to each other.”

For this use-after-free

free(foo);
foo->bar = ...;

“No, that’s OK; there is no malloc
call between the free and use.”

As a final example, a buffer overflow
checker flagged a bunch of errors of the
form

unsigned p[4];
...
p[4] = 1;

“No, ANSI lets you write 1 past the
end of the array.”

After heated argument, the programmer
said, “We’ll have to agree to disagree.”
We could agree about the disagreement,
though we couldn’t quite
comprehend it. The (subtle?) interplay
between 0-based offsets and buffer sizes
seems to come up every few months.

While programmers are not often
so egregiously mistaken, the general
trend holds; a not-understood bug
report is commonly labeled a false
positive, rather than spurring the programmer
to delve deeper. The result?
We have completely abandoned some
analyses that might generate difficult-to-understand
reports.

How to handle cluelessness. You cannot
often argue with people who are
sufficiently confused about technical
matters; they think you are the one
who doesn’t get it. They also tend to get
emotional. Arguing reliably kills sales.
What to do? One trick is to try to organize
a large meeting so their peers do
the work for you. The more people in
the room, the more likely there is someone
who is very smart and respected and cares
(about bugs and about the given code),
can diagnose an error (to counter arguments
it’s a false positive), has been
burned by a similar error, loses his/her
bonus for errors, or is in another group
(another potential sale).

Further, a larger results meeting
increases the probability that anyone
laid off at a later date attended it and
saw how your tool worked. True story:
A networking company agreed to buy
the Coverity product, and one week
later laid off 110 people (not because of
us). Good or bad? For the fired people
it clearly wasn’t a happy day. However,
it had a surprising result for us at a
business level; when these people were
hired at other companies some suggested
bringing the tool in for a trial,
resulting in four sales.

What happens when you can’t fix
all the bugs? If you think bugs are bad
enough to buy a bug-finding tool, you 
will fix them. Not quite. A rough heuristic
is: with fewer than 1,000 bugs,
fix them. More? The baseline is to record
the current bugs and not fix them,
but do fix any new bugs. Many companies
have independently come up with
this practice, which is more rational 
than it seems. Having a lot of bugs usu- 
ally requires a lot of code. Much of it 
won't have changed in a long time. A 
reasonable, conservative heuristic is 
if you haven’t touched code in years, 
don’t modify it (even for a bug fix) to 
avoid causing any breakage. 
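
The baseline practice amounts to a set difference. In this sketch, reports are identified by opaque string keys (an assumption; how a tool decides that two reports are "the same" bug across runs is a problem of its own), and only reports absent from the recorded baseline are surfaced for fixing.

```python
def new_bugs(baseline, current):
    """Return reports in the current run that are not in the baseline,
    in their original order."""
    known = set(baseline)
    return [report for report in current if report not in known]

def fixed_bugs(baseline, current):
    """Return baseline reports that no longer appear in the current run."""
    present = set(current)
    return [report for report in baseline if report not in present]
```

For example, with a baseline of ["leak:a.c:f", "uaf:b.c:g"] and a current run of ["uaf:b.c:g", "overflow:c.c:h"], only "overflow:c.c:h" is treated as new work, while "leak:a.c:f" shows up as having disappeared.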

A surprising consequence is it’s not 
uncommon for tool improvement to be 


| viewed as “bad” or at least a problem. 


Pretend you are a manager. For anything 


| bad you can measure, you want it to di- 
| minish over time. This means you are 


improving something and get a bonus. 
You may not understand techni- 
cal issues that well, and your boss cer- 
tainly doesn’t understand them. Thus, 
you want a simple graph that looks like 


| Figure 1; no manager gets a bonus for 
| Figure 2. Representative story: At com- 


pany X, version 2.4 of the tool found 
approximately 2,400 errors, and over 
time the company fixed about 1,200 of 
them. Then it upgraded to version 3.6. 
Suddenly there were 3,600 errors. The 
manager was furious for two reasons: 


One, we “undid” all the work his people
had done, and two, how could we have
missed them the first time?

Figure 1. Bugs down over time = manager bonus. (Axes: bad, time.)

How do upgrades happen when 
more bugs is no good? Companies in- 
dependently settle on a small number 
of upgrade models: 

Never. Guarantees “improvement”; 

Never before a release (where it would 
be most crucial). Counterintuitively hap- 
pens most often in companies that be- 
lieve the tool helps with release quality 
in that they use it to “gate” the release; 


Never before a meeting. This is at least
socially rational;

Upgrade, then roll back. Seems to hap- 
pen at least once at large companies; 
and 

Upgrade only checkers where they fix 
most errors. Common checkers include 
use-after-free, memory corruption, 
(sometimes) locking, and (sometimes) 
checkers that flag code contradictions. 

Do missed errors matter? If people 
don’t fix all the bugs, do missed errors 
(false negatives) matter? Of course not; 
they are invisible. Well, not always. 
Common cases: Potential customers 
intentionally introduced bugs into the 
system, asking “Why didn’t you find it?” 
Many check if you find important past
bugs. The easiest sale is to a group whose
code you are checking that was horribly 
burned by a specific bug last week, and 
you find it. If you don’t find it? No mat- 
ter the hundreds of other bugs that may 
be the next important bug. 

Here is an open secret known to bug
finders: The set of bugs found by tool
A is rarely a superset of the set found
by another tool B, even if A is much
better than B. Thus,
the discussion gets pushed from “A is
better than B” to “A finds some things,
B finds some things” and does not help 
the case of A. 

Adding bugs can be a problem; los- 
ing already inspected bugs is always a 
problem, even if you replace them with 
many more new errors. While users 
know in theory that the tool is “not a 
verifier,” it’s very different when the tool 
demonstrates this limitation, good and 
hard, by losing a few hundred known er- 
rors after an upgrade. 

The easiest way to lose bugs is to add 
just one to your tool. A bug that causes 
false negatives is easy to miss. One such 
bug in how our early research tool’s 
internal representation handled array 
references meant the analysis ignored 
most array uses for more than nine
months. In our commercial product,
blatant situations like this are prevent-
ed through detailed unit testing, but un-
covering the effect of subtle bugs is still
difficult because customer source code 
is complex and not available. 


Figure 2. No bonus.

Churn

Users really want the same result from
run to run. Even if they changed their
code base. Even if they upgraded the tool.
Their model of error messages? Compil-
er warnings. Classic determinism states:
the same input + same function = same
result. What users want: different input
(modified code base) + different func- 
tion (tool version) = same result. As a 
result, we find upgrades to be a constant 
headache. Analysis changes can easily 
cause the set of defects found to shift. 
The new-speak term we use internally is 
“churn.” A big change from academia is 
that we spend considerable time and en- 
ergy worrying about churn when modify- 
ing checkers. We try to cap churn at less 
than 5% per release. This goal means 
large classes of analysis tricks are disal- 
lowed since they cannot obviously guar- 
antee minimal effect on the bugs found. 
Randomization is verboten, a tragedy 
given that it provides simple, elegant so- 
lutions to many of the exponential prob- 
lems we encounter. Timeouts are also 
bad and sometimes used as a last resort 
but never encouraged. 
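
To make the 5% cap concrete, here is a minimal sketch of a churn check between two releases. The text does not give a formula, so the definition used here (defects that appeared or disappeared, divided by the size of the previous defect set) is an assumption, as are the names `churn` and `release_ok`.

```python
MAX_CHURN = 0.05  # "less than 5% per release", per the text


def churn(before, after):
    """Fraction of the defect set that shifted between two runs.

    Assumed definition: |before XOR after| / |before|.
    """
    before, after = set(before), set(after)
    if not before:
        # Nothing to compare against: any new report is total churn.
        return 1.0 if after else 0.0
    return len(before ^ after) / len(before)


def release_ok(before, after, limit=MAX_CHURN):
    """True if the change in reported defects stays under the cap."""
    return churn(before, after) < limit


previous = set(range(100))                # 100 defect IDs from the old version
candidate = (previous - {0, 1}) | {100}   # two lost, one gained
```

Here `churn(previous, candidate)` is 3/100, so the candidate passes; losing five defects outright would reach the 5% limit and fail the strict comparison.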

Myth: More analysis is always good. 
While nondeterministic analysis might 
cause problems, it seems that adding 
more deterministic analysis is always 
good. Bring on path sensitivity! Theorem 
proving! SAT solvers! Unfortunately, no. 

At the most basic level, errors found 
with little analysis are often better than 
errors found with deeper tricks. A good 
error is probable, a true error, easy to di- 
agnose; best is difficult to misdiagnose. 
As the number of analysis steps increas-
es, so, too, does the chance of analysis
mistake, user confusion, or the per-
ceived improbability of the event sequence.
No analysis equals no mistake. 

Further, explaining errors is often 
more difficult than finding them. A 
misunderstood explanation means the 
error is ignored or, worse, transmuted 
into a false positive. The heuristic we 
follow: Whenever a checker calls a com- 
plicated analysis subroutine, we have to 
explain what that routine did to the user, 
and the user will then have to (correctly) 
manually replicate that tricky thing in 
his/her head. 

Sophisticated analysis is not easy to
explain or redo manually. Compound-
ing the problem, users often lack a
strong grasp on how compilers work.
A representative user quote is “‘Static’
analysis? What’s the performance over-
head?”

The end result? Since the analysis
that suppresses false positives is invis-
ible (it removes error messages rather
than generates them), its sophistication
has scaled far beyond what our research
system did. On the other hand, the
commercial Coverity product, despite 
its improvements, lags behind the re- 
search system in some ways because it 
had to drop checkers or techniques that 
demand too much sophistication on 
the part of the user. As an example, for 
many years we gave up on checkers that 
flagged concurrency errors; while find-
ing such errors was not too difficult, ex-
plaining them to many users was. (The
PREfix system also avoided reporting
races for similar reasons, though report-
ing them is now supported by Coverity.)

No bug is too foolish to check for. Giv- 
en enough code, developers will write 
almost anything you can think of. Fur- 
ther, completely foolish errors can be 
some of the most serious; it’s difficult to 
be extravagantly nonsensical in a harm- 
less way. We’ve found many errors over 
the years. One of the absolute best was 
the following in the X Window System: 


effort to achieve low false-positive rates 
in our static analysis product. We aim 
for below 20% for “stable” checkers. 
When forced to choose between more 
bugs or fewer false positives we typi- 
cally choose the latter. 

Talking about “false positive rate” is 
simplistic since false positives are not 
all equal. The initial reports matter in- 
ordinately; if the first N reports are false 
positives (N = 3?), people tend to utter 
variants on “This tool sucks.” Further-
more, you never want an embarrass-
ing false positive. A stupid false posi-
tive implies the tool is stupid. (“It’s not
even smart enough to figure that out?”) 
This technical mistake can cause so- 
cial problems. An expensive tool needs 
someone with power within a company 
or organization to champion it. Such 
people often have at least one enemy. 
You don’t want to provide ammunition 
that would embarrass the tool champi- 
on internally; a false positive that fits in 

experience covered here was the work of
many. We thank all who helped build the 
tool and company to its current state, 
especially the sales engineers, support 
engineers, and services engineers who 
took the product into complex environ- 
ments and were often the first to bear 
the brunt of problems. Without them 
there would be no company to docu- 
ment. We especially thank all the cus- 
tomers who tolerated the tool during 
its transition from research quality to 
production quality and the numerous 
champions whose insightful feedback 
helped us focus on what mattered. 

0 && geteuid == 0) {
    ErrorF("only root");
    exit(1);
}


It allowed any local user to get root
access and generated enormous press
coverage, including a mention on Fox
News (the Web site). The checker was
written by Scott McPeak as a quick hack 
to get himself familiar with the system. It 
made it into the product not because of 
a perceived need but because there was 
no reason not to put it in. Fortunately. 
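
The same class of mistake, naming a function instead of calling it, can be reproduced outside C. The sketch below is a Python analogue for illustration only, not the X Window System code: `get_effective_uid` is a hypothetical stand-in for a privilege query, and `looks_like_uncalled_function` is a toy check in the spirit of the checker, not its actual logic.

```python
def get_effective_uid():
    """Hypothetical stand-in for a privilege query; always reports root (0)."""
    return 0


def buggy_is_root():
    # Missing parentheses: this compares the function object itself to 0.
    # A function object never equals 0, so the result is always False.
    return get_effective_uid == 0


def fixed_is_root():
    # Calls the function and compares its return value.
    return get_effective_uid() == 0


def looks_like_uncalled_function(value):
    """Toy check: flag a callable that is being compared as if it were data."""
    return callable(value)
```

`buggy_is_root()` can never be true, so any guard built on it silently stops protecting anything, which is why such a "foolish" error is worth checking for.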

False Positives

False positives do matter. In our experi- 
ence, more than 30% easily cause prob- 
lems. People ignore the tool. True bugs 
get lost in the false. A vicious cycle starts 
where low trust causes complex bugs 
to be labeled false positives, leading to 
yet lower trust. We have seen this cycle 
triggered even for true errors. If people 
don’t understand an error, they label it 
false. And done once, induction makes 
the (n+1)th time easier. We initially 
thought false positives could be elimi-
nated through technology. Because of
this dynamic we no longer think so.
We've spent considerable technical


d. The tautological check geteuid == 0 was in-
tended to be geteuid() == 0. In its current
form, it compares the address of geteuid to 0; giv-
en that the function exists, its address is never 0.


a punchline is really bad.


Conclusion 

While we've focused on some of the 
less-pleasant experiences in the com- 
mercialization of bug-finding prod- 
ucts, two positive experiences trump 
them all. First, selling a static tool has
become dramatically easier in recent
years. There has been a seismic shift in
terms of the average programmer “get- 
ting it.” When you say you have a static 
bug-finding tool, the response is no lon- 
ger “Huh?” or “Lint? Yuck.” This shift 
seems due to static bug finders being in 
wider use, giving rise to nice network-
ing effects. The person you talk to likely
knows someone using such a tool, has a
competitor that uses it, or has been in a
company that used it.

Moreover, while seemingly vacuous 
tautologies have had a negative effect 
on technical development, a nice bal- 
ancing empirical tautology holds that 
bug finding is worthwhile for anyone 
with an effective tool. If you can find 
code, and the checked system is big 
enough, and you can compile (enough 
of) it, then you will always find serious 
errors. This appears to be a law. We en- 
courage readers to exploit it. 

Assessing the Changing U.S. IT R&D Ecosystem

The National Academy of Sciences recommends
what the U.S. government should do to help
maintain American IT leadership.

BY ERIC BENHAMOU, JON EISENBERG, AND RANDY H. KATZ


THE U.S. NATIONAL Academy of Sciences was 
established by President Abraham Lincoln in 1863 
to provide unbiased assessment of scientific and 
technology policy issues facing the U.S. government 
(http://www.nas.edu). In 2006, the Academy’s 
Computer Science and Telecommunications Board,
under the sponsorship of the U.S. National Science 
Foundation, established a committee of experts in 
the fields of IT research (Randy Katz, Ed Lazowska, 
Raj Reddy), venture investment (Eric Benhamou, 
David Nagel, Arati Prabhakar), economics of 
innovation and globalization (Andrew Hargadon, 
Martin Kenney, Steven Klepper), and labor and work- 
force issues (Stephen Barley, Lucy Sanders) to assess 
the effects of changes in the U.S. IT R&D ecosystem 
(http://sites.nationalacademies.org/CSTB/index. 
htm). The committee took as its charter to examine 
the period from 1995 to the present, a
time of rapid expansion and contrac- 
tion of the field in response to the In- 
ternet boom, a precipitous stock-mar- 
ket collapse, increased competition 
through globalization of the industry, 
and an economy-disrupting terrorist 
attack and subsequent aftershocks to 
the economy, the full implications of 
which have yet to be understood. 
While the committee’s study fo-
cused on global developments and
their implications, particularly for
the U.S., the report, which was issued
in February 2009, is of interest to re-
searchers and policymakers in all of
the world’s IT-intensive economies, as
the globalization of the industry accel- 
erates with the rise of India and China 
as major centers. 

Here, we summarize the report’s ob- 
servations, findings, and recommen- 
dations (http://www.nap.edu/catalog. 
php?record_id=12174), covering sev- 
eral main themes: 

» IT’s central role in the developed
world;

» The committee’s assessments of
the IT R&D ecosystem, with an inevitable
U.S.-centric view, calling for increased
and balanced investment in IT research
by the U.S. government, with support-
ing investment in education and out-
reach to develop a high-skills/infor-
mation-technology-aware work force,
a commitment to reduce the “friction”
the IT industry faces in the U.S. econ-
omy (such as in its highly litigious in-
tellectual property system and regula-
tory regimes whose compliance unduly
burdens small companies significantly
more than larger ones), and expanded
commitment to develop a leading-edge
U.S. IT infrastructure; and

» The American Recovery and Rein-
vestment Act of 2009 and the actions
taken by the Obama administration
directly relevant to the recommenda-
tions of the committee.


IT Impact 


Advances in IT and its applications
represent a signal success for U.S. sci-
entific, engineering, business, and
government over the past 50 years. IT
has transformed, and continues to
transform, all aspects of people’s lives
in the developed world, with increas-
ing effects on the developing world,
including in commerce, education,
employment, health care, manufac-
turing, government, national security,
communications, entertainment, sci-
ence, and engineering. IT also drives
the overall economy, both directly (the
IT sector itself) and indirectly (other 
sectors powered by advances in IT). 

To appreciate the magnitude and 
breadth of these effects, imagine 
spending a day in the developed world 
without IT. It would be a day without
the Internet and all it enables; no diag-
nostic medical imaging; automobiles
without electronic ignition, antilock
brakes, and electronic stability control;
no digital media (wireless telephones,
high-definition televisions, MP3 audio,
DVD video, computer animation, and
videogames); aircraft unable to fly and
travelers unable to navigate with ben-
efit of the global positioning system;
weather forecasters with no models;
banks and merchants unable to trans-
fer funds electronically; factory auto-
mation unable to function; and the
U.S. military without technological su- 
premacy. It would be, for most people, 
a “day the Earth stood still.” 

IT and its effect on the economy 
continue to grow in size and impor- 
tance. According to estimates of the 
U.S. government’s Bureau of Econom- 
ic Analysis (http://www.bea.gov/), for 
2006 the IT-intensive “information- 
communications technology produc- 
ing” industries accounted for about 4% 
of the U.S. economy but contributed
over 14% of real gross domestic prod-
uct (GDP) growth. As a point of refer-
ence, U.S. federal funding in fiscal year
2008 for computer science research
was around $3 billion, less than 0.025%
of GDP. This substantial contribution
to the economy reflects only a portion
of the overall long-term benefits from
IT research investment. It is in all gov-
ernments’ interest for these benefits to
continue to grow and accrue.
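
As a back-of-the-envelope check, the percentages quoted above can be combined directly. The sketch below uses only the figures stated in the text; the variable names are illustrative, and the results are arithmetic implications of those figures rather than additional data.

```python
# Figures quoted in the text.
IT_SHARE_OF_ECONOMY_PCT = 4               # "about 4% of the U.S. economy"
IT_SHARE_OF_GDP_GROWTH_PCT = 14           # "over 14% of real GDP growth"
CS_RESEARCH_FUNDING_USD = 3_000_000_000   # "around $3 billion" (FY2008)

# Share of growth relative to share of the economy.
growth_leverage = IT_SHARE_OF_GDP_GROWTH_PCT / IT_SHARE_OF_ECONOMY_PCT

# "Less than 0.025% of GDP": 0.025% is 25 parts in 100,000, so the
# statement holds only if GDP exceeds funding * 100,000 / 25.
implied_min_gdp_usd = CS_RESEARCH_FUNDING_USD * 100_000 // 25
```

On these numbers the sector contributes roughly 3.5 times its share of the economy to growth, and the funding claim is consistent with any GDP above $12 trillion.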
Figure 1 is the “tire-tracks” diagram
used to describe a large number of IT-
industry sectors that have grown into
billion-dollar industries through a
combination of university and indus-
trial research. Creation of these sectors
has been achieved, not just from the
government investment in foundation-
al research, but from a vibrant venture
community able and willing to provide
the risk capital to fund the commercial-
ization of new ideas from which such
industrial sectors are able to grow.

Figure 1. Examples of U.S.-government- [...] products and industries, the so-called tire-tracks diagram.

Figure 2. Key elements and relationships in the U.S. IT R&D ecosystem.


Assessing the Ecosystem 
The U.S. IT R&D ecosystem was the envy
of the world in 1995; Figure 2 outlines
its essential elements: university and
industrial research enterprises; emerg-
ing start-up companies and more ma-
ture technology companies; the indus-
try that finances innovative firms; and
the regulatory environment and legal
frameworks. From the perspective of
IT, the U.S. enjoyed a strong industrial 
base, the ability to create and leverage 
new technological advances, and an ex- 
traordinary system for creating world- 
class technology companies. 

The period from 1995 to the present 
has been a turbulent one for the U.S. 
and the world, as characterized by: 

» Irrational exuberance for IT stocks
and the NASDAQ bust (2000);

» Y2K and the development of the
Indian software industry;

» Aftereffects of the terror attacks of
September 11, 2001;

» Financial scandals and bankrupt-
cies (2001);

» Surviving after the bubble burst
(2001-2004);

» Recovery (2005-2007); and

» Global economic financial crisis
(2008).

These shocks took their toll, and in 
the view of the committee, government 
actions are necessary to sustain the 
U.S. IT R&D ecosystem. The U.S. gov- 
ernment should: 

» Strengthen the effectiveness of
government-funded IT research;

» Remain the strongest generator of
and magnet for technical talent;

» Reduce the friction that harms the
effectiveness of the U.S. IT R&D ecosys-
tem while maintaining other impor-
tant political and economic objectives;
and

» Ensure that the U.S. has an infra-
structure for communications, com-
puting, applications, and services that
enables U.S. IT users and innovators to
lead the world.

Potential high-value societal ben-
efits of continued investment in IT in-
clude:

» Safer, robotics-enhanced automo-
biles;

» A more scalable, manageable, se-
cure, robust Internet;

» Personalized and collaborative ed-
ucational tools for tutoring and just-in-
time learning;

» Personalized health monitoring;

» Augmented cognition to help peo-
ple cope with information overload;
and

» IT-driven advances in all fields of
science and engineering.


Much has been learned from de- 
cades of experience about what con- 
stitutes an environment that fosters 
successful research and its transition 
to commercial formation. The follow-
ing insights are extracted from a 2003
Academy report from the National
Research Council called Innovation in
Information Technology (The National
Academies Press, Washington, D.C.,
2003; http://www.nap.edu/openbook.
php?isbn=0309089808):

On the results of research: 

» U.S. international leadership in
IT (vital to the country) springs from a
deep tradition of research;

» The unanticipated results of re-
search are often as important as the
anticipated results; and

» The interaction of research ideas
multiplies their effect; for example,
concurrent research programs target-
ing integrated circuit design, computer
graphics, networking, and workstation-
based computing strongly reinforce
and amplify one another.

On research as a partnership: 

» The success of the IT research en-
terprise reflects a complex partnership
among government, industry, and uni-
versities;

» The federal government has had
and will continue to have an essential
role in sponsoring fundamental re-
search in IT (largely university-based)
because it does what industry cannot
do. Industrial and governmental in-
vestment in research reflects different
motivations, resulting in differences in
style, focus, and time horizon;

» Companies have little incentive to
invest significant amounts in activi-
ties whose benefits spread quickly to
their rivals. Fundamental research of-
ten falls into this category. By contrast,
the vast majority of corporate R&D ad-
dresses product-and-process develop-
ment; and

» Government funding for research
has leveraged the effective decision
making of visionary program manag-
ers and program-office directors from
the research community, empowering
them to take risks in designing pro-
grams and selecting grantees. Govern-
ment sponsorship of research, espe-
cially in universities, also helps develop
the IT talent used by industry, universi-
ties, and other parts of the economy.

On the economic payoff of research: 

» Past returns on federal investment
in IT research have been extraordi-
nary for both U.S. society and the U.S.
economy. The transformative effects
of IT grow as innovations build on one
another and as user know-how com-
pounds. Priming that pump for tomor-
row is today’s challenge; and

» When companies create products
using the ideas and work force that re-
sult from federally sponsored research,
they repay the nation in jobs, tax reve-
nue, productivity increases, and world
leadership.


Ultrascale scientific computing capability, like the world’s fastest supercomputer, Jaguar,
in the U.S. Department of Energy’s Oak Ridge National Laboratory, is considered a top
priority among government science funding.


Inevitable Globalization of IT
Another significant trend, analyzed in
detail in the report, is how the IT in-
dustry has become more globalized,
especially with the dramatic rise of the
economies of India and China, fueled
in no small part by their development
of vibrant IT industries. Moreover, In-
dia and China represent fast-growing
markets for IT products, with both
likely to grow their IT industries into
economic powerhouses for the world,
reflecting both deliberate government
policies and the existence of strong,
vibrant private-sector firms, both do-
mestic and foreign. Ireland, Israel,
Japan, Korea, and Taiwan, as well as
some Scandinavian countries, have
also developed strong niches within
the increasingly globalized industry.
Today, a product conceptualized and
marketed in the U.S. might be designed
to specifications in Taiwan, and batter-
ies or hard drives obtained from Japan
might become parts in a product as-
sembled in China. High-value software
and integrated circuits at the heart of a
product might be designed and devel-
oped in the U.S., fabricated in Taiwan,
and incorporated into a product as-
sembled from components supplied
from around the world.

Unfortunately, during a period of
rapid globalization, U.S. national poli-
cies have not sufficiently buttressed
the ecosystem or have generated side
effects that have reduced its effectiveness.
This is particularly true of such areas
as IT education, U.S. government IT
research funding, and the regulations
that affect the corporate overhead and
competitiveness of innovative IT com-
panies. As a result, the U.S. position in
IT leadership today has eroded com-
pared to that of prior decades, and IT
leadership may pass to other nations
within a generation unless the U.S.
recommits itself to providing the re-
sources needed to fuel U.S. IT innova-
tion, removing important roadblocks
that reduce the ecosystem’s effective-
ness in generating innovation and the
fruits of innovation, and becoming a
lead innovator and user of IT.

In 2009, the ecosystem also faced
new challenges from a global eco- 
nomic crisis that continues to unfold. 
There has been a marked reduction in 
the availability of venture capital fol- 
lowing losses in pension funds and 
endowments, as well as in initial pub- 
lic offerings by technology companies 
and a decline in mergers and acquisi- 
tions. There is also a steep decline in 
consumer confidence, suggesting that 
a consumer-driven recovery is unlikely 
in the near term. Significant layoffs 
and hiring cutbacks in IT firms and 
across the global economy seem all but 
certain to adversely affect the IT R&D 
ecosystem, undermining the partial re- 
covery seen over the past few years. The 
magnitude, duration, and enduring ef- 
fects on the ecosystem of the downturn 
are not yet clear. 

Globalization is a broad and sweep-
ing phenomenon that cannot be con-
tained. The committee concluded that,
if embraced rather than resisted, it
presents more opportunity than threat
to the U.S. national IT R&D ecosystem.
To thrive in this landscape, the U.S. 
should play to its strengths, notably its 
continued leadership in conceptualiz- 
ing idea-intensive new concepts, prod- 
ucts, and services the rest of the world 
desires and where the greatest incre- 
ments of value-added are captured. 

Toward this end, it is necessary for 
the U.S. to have the best-funded and 
most-creative research institutions; 
develop and attract the best techni- 
cal and entrepreneurial talent among 
its own people, as well as those from 
around the world; make its economy 
the world’s most attractive for forming 
new ventures and nurturing small, in-
novative firms; and create an environ-
ment that ensures deployment of the
most advanced technology infrastruc-
tures, applications, and services in the
U.S. itself for the benefit of the nation’s
people, institutions, and firms.


Findings and Recommendations 
Here, we describe the report’s findings 
and recommendations in the context 
of the four objectives for government 
action described earlier: 

Objective 1. Strengthen the effective- 
ness of federally funded IT research. 
University research is focused largely 
on basic research, while industrial re- 
search concentrates on applied R&D, 
meaning that much of the feedstock 
for long-term innovation is to be found
in the nation’s universities. As a result,
support for university education and
research is essential to generating the 
stream of innovations that nourish the 
rest of the ecosystem. Measures to en- 
hance the productivity of university re- 
search funding, as well as that of other 
R&D funding, would increase the pay- 
off from these investments. 

Although the advances in IT over the 
past 50 years have been breathtaking, 
the field remains in its relative infancy, 
and continuing advances over the com-
ing decades can be expected but only as
long as the IT R&D ecosystem’s capac-
ity to sustain innovation is preserved
and enhanced.

Current decisions about how the
U.S. should make investments (both
civilian and military) in basic IT re-
search do not seem to reflect the full ef-
fect of IT on society and the economy.
The government’s own data indicates 
the U.S. lags behind Europe and Japan 
in civilian funding for IT R&D. Mean- 
while, the European Union and China 
have aggressive plans for strengthen- 
ing their global positions in IT through 
substantial and increasing IT R&D in- 
vestment. 
Regaining a leading position requires
aggressive action, including ambitious
targets for increased R&D investment.
It is appropriate and necessary for the
U.S. to correspondingly adjust its own
IT R&D spending level, just as individu-
al businesses, following best practices,
track their global competitors’ busi-
ness models to avoid falling behind
in global market share. Increased fed- 
eral investment in IT research would 
reflect the importance of IT to the na-
tion’s society and economy as a whole
and would allow the U.S. to build and
sustain the already large positive effect
of IT on its economy. The desirability
of increased federal investment in IT
R&D was also recognized in a 2007 re-
port by the National Academies, Rising
Above the Gathering Storm: Energizing
and Employing America for a Brighter
Economic Future (http://www.nap.edu/
catalog.php?record_id=11463), and, to
some extent, by provisions in the sub-
sequently passed America Com-
petes Act of 2007 (http://thomas.loc.
gov/cgi-bin/bdquery/z?d110:SN00761:
@@@D&summ2=mMw&). Moreover,
in its August 2007 report, the Presi- 
dent’s Council of Advisors on Science 
and Technology (PCAST, http://www. 
ostp.gov/pdf/nitrd_review.pdf) found 
an imbalance in the current federal 
R&D portfolio in that more long-term, 
large-scale, multidisciplinary R&D is 
needed. PCAST concluded that current 
interagency coordination processes for
networking and IT R&D are inadequate
for meeting anticipated national needs
and for maintaining U.S. leadership in 
an era of global competitiveness. 

A strategic reassessment of U.S. R&D 
priorities is needed, an analysis merit- 
ing the attention of first-tier scientists 
and engineers from academia, indus- 
try, and government. A strong focus on 
IT is important due to the special role 
of IT within science and engineering. 

Toward this end, a means of deliver- 
ing to the highest levels of the U.S. gov- 
ernment the best possible advice on 
the transformational power of IT would 
help ensure that the nation invests at 
appropriate levels in IT research and 
this investment is made as efficiently 
and as effectively as possible, in part 
through improved coordination of 
federal R&D investments. This advice 
could be provided in a number of ways, 
including augmentation of the current 
presidential science-and-technology 
advisory structure, establishment of a
high-level IT adviser to the President
of the U.S., or reestablishment of an IT-
specific presidential advisory commit-
tee (such as the President’s Informa-
tion Technology Advisory Committee,
which operated 1997-2005).

Finding. A robust program of U.S. 
government-sponsored R&D in IT is vi- 
tal to the nation. 

Computer scientists at Sandia National Laboratories successfully demonstrated the ability
to run more than a million Linux kernels as virtual machines. The Sandia research, two years
in the making, was funded by the Department of Energy’s Office of Science, the National
Nuclear Security Administration’s Advanced Simulation and Computing program, and by
internal Sandia funding.


Finding. The level of U.S. government 
investment in fundamental research in 
IT continues to be inadequate. 

Recommendation. As the U.S. govern- 
ment increases its investment in long- 
term basic research in the physical sci- 
ences, engineering, mathematics, and 
information sciences, it should care- 
fully assess the level of investment in 
IT R&D, mindful of economic return, 
societal effect, enablement of discov- 
ery across science and engineering, 
and other benefits of additional effort 
in IT and should ensure that appropri- 
ate advisory mechanisms are in place 
to guide investment within the IT R&D 
portfolio. 

Objective 2. Remain the strongest gen- 
erator of and magnet for technical talent. 
There is cause for concern that an un- 
dersized, insufficiently prepared work 
force for the IT industry will accelerate 
migration of higher-value activities out 
of the U.S. While the committee did not 
address the entire array of technology-
sector wage-and-job-security issues, it
is clear that without a work force that
is knowledgeable with respect to tech- 
nology and that has sufficient numbers 
of highly trained workers, the U.S. will 
find it difficult to retain the most in- 
novation-driven parts of its own IT in- 


dustry. Despite demand for such work- 
ers, the number of students specifying 
their intention to major in computing 
and information sciences has dropped 
significantly since 2003. The problem 
of declining enrollment in the comput- 
ing disciplines (compared to projected 
demand) is compounded by the very
low participation of underrepresented
groups in IT.

Robotics technologies, for defense, education, and manufacturing among many other

The U.S. should rebuild its national 
IT educational pipeline, encouraging 
all qualified students, regardless of 
race, gender, or ethnicity, to enter the 
discipline. Without sustained, ampli- 
fied intervention, the U.S. is unlikely 
to have an educational pipeline capa- 
ble of yielding a revived and diverse IT 
work force over the next 10 years. To 
achieve the needed revitalization, the 
U.S. should pursue a multipronged ap- 
proach: improve technology education 
at all levels from kindergarten through 
grade 12; broaden participation in IT 
careers by women, people with disabili- 
ties, and all minorities, particularly Af- 
rican-Americans, Hispanics, and Native 
Americans; and retain foreign students 
who have received advanced degrees 
in IT. Immigrants have been especially 
significant in high-technology entre- 
preneurship; for at least 25% of the U.S. 
engineering and technology companies 
started between 1995 and 2005, mostly
in software, innovation, and manufac-
turing-related services, at least one key
founder was born outside the U.S. 

Finding. Rebuilding the computing- 
education pipeline at all levels requires 
overcoming numerous obstacles, in 
turn portending significant challenges 
for the development of future U.S. IT 
work-force talent. 

applications, are always strong contenders for government funding. The FIRST Robotics
Competition, here the Northstar Regional at the University of Minnesota, Minneapolis,
challenged teams of young people and their mentors to solve a common problem using a
standard “kit of parts.”

Finding. Participation in IT of wom-
en, people with disabilities, and all
minorities is especially low and declin-
ing. This low level of participation will
affect the ability of the U.S. to meet its
work-force needs and place it at a com-
petitive disadvantage by not allowing it
to capitalize on the innovative thinking
of half its population.

Recommendation. To build the
skilled work force it needs to re-
tain high-value IT industries, the U.S.
should invest more in education and
outreach initiatives to nurture and in-
crease its IT talent pool.

Finding. Although some IT profes-
sional jobs will be offshored, at the time
the committee completed its report in
2008, it found there were more IT jobs in
the U.S. than at any time during the dot-
com boom, even in the face of corporate
offshoring trends. While this may no
longer be true in the wake of the global
recession, anecdotal evidence indicates
strong demand for new graduates in
computer science programs nationally.

Recommendation. The U.S. should
increase the availability and facilitate
the issuance of work and residency vi-
sas to foreign students who graduate
with advanced IT degrees from U.S.
educational institutions.

Objective 3. Reduce friction that
harms the effectiveness of the U.S. IT R&D
ecosystem. Such factors as intellectual
property litigation and corporate gover-
nance regulations have become sourc-
es of increased friction in the conduct
of business in the U.S. and can have the
effect of making other countries more
attractive places to establish the small,


innovative companies that are essential 
components of the ecosystem. These 
issues are not simple; for example, in 
terms of corporate governance, the 
dampening effects of increased regula- 
tion must be weighed against the ben- 
efits of restoring and maintaining pub- 
lic confidence in equity markets. But to 
keep the U.S. attractive for new venture 
formation and to sustain the nation’s 
unrivaled ability to transform innova- 
tive new concepts into category-defin- 
ing products and services the world 
desires, the potential effects on the IT 
R&D ecosystem should be weighed in 
considering new measures or reforms 
in such areas as corporate governance 
and intellectual property litigation. 

Finding. Fewer young, innovative IT 
companies are gaining access to U.S. 
public equity markets. 


Recommendation. Congress and fed-
eral agencies (such as the Securities
and Exchange Commission, http://www.sec.gov/,
and the Patent and Trademark
Office, http://www.uspto.gov/) should
consider the effect of both current and 
proposed policies and regulations on 
the IT ecosystem, especially on young, 
innovative IT businesses, and consider 
measures to mitigate them where ap- 
propriate. 


Phillip J. Bond, president of leading U.S. technology trade association TechAmerica, testified 
before the Senate Homeland Security and Government Affairs Subcommittee last April on 
advancing U.S. efforts toward the digital future. 

Objective 4. Ensure that the U.S. has
the infrastructure that enables U.S. IT us- 
ers and innovators to lead the world. The 
U.S. has long enjoyed the position of be- 
ing the largest market for IT; global de- 
mographics and relative growth rates 
suggest this advantage is unlikely to 
endure. Fortunately, although a healthy 
domestic IT market is an important ele- 
ment of a healthy domestic ecosystem, 
market size is not the only factor in 
leadership. The environment fostered 
by leading-edge users of technology 
(including those who can leverage re- 
search, innovate, and create additional 
value) creates the context for IT’s next 
wave and its effective application. In 
such an environment, all sectors of so- 
ciety (including consumers, business- 
es, and governments) exploit and make 
the best use of advanced IT. But there 
are indications that the U.S. has indeed 
lost its leadership in the use of IT. 

In particular, the U.S. broadband 
infrastructure is not as advanced or 
as widely deployed as that in many 
other countries. Should this situation 
persist, the U.S. will no longer be the 
nation in which the most innovative, 
most advanced technology and highest 
value-added products and services are 
conceptualized and developed. 

Moreover, in addition to broadly 
fostering research and commercial in- 
novation, government-sponsored R&D 
can help meet particular government 
demands. Though the government 
is no longer a lead IT user across the 
board, it continues to play an appropri-
ate leadership role where federal-agen-
cy requirements are particular to their
missions and commercial analogs are
scarce or nonexistent.

Finding. The most dynamic IT sec-
tor is likely to be in the country with
the most demanding IT customers and
consumers.

Finding. In terms of nationwide
availability, use, and speed of broad-
band, the U.S. (a former leader in the
technology) has been losing ground
compared with other nations.

Recommendation. The U.S. should
establish an ambitious target for re-
gaining and holding a decisive lead in
the broad deployment of affordable
gigabit-broadband services. Federal
and state regulators should explore
models and approaches that reduce
regulatory and jurisdictional bottle-
necks and increase incentives for in-
vestment in these services.

Recommendation. Government (fed- 
eral, state, local) should foster com- 
mercial innovation and itself make 
strategic investments in IT R&D and 
deployment so the U.S. retains a global 
lead position in areas where it has par- 
ticular mission requirements. 


American Recovery and 
Reinvestment Act of 2009 

The recent global financial crisis has 
had a major effect on the U.S. IT in- 
dustry, which appeared to be recover- 
ing following the shocks of the early 
part of the decade; 2008 was the first 
year in almost two decades when there 
were no IT initial public offerings on 
U.S. stock exchanges. Even high-flying 
Internet companies like Google have 
sustained layoffs, though modest. The 
effect is not limited to the U.S.; the 
slowdown is evident throughout the 
globalized IT industry. 

The American Recovery and Rein-
vestment Act of 2009 (http://www.irs.gov/newsroom/article/0,,id=204335,00.html)
provides significant funding for
infrastructure and research investment 
as part of the U.S. government’s recent 
economic stimulus package. The Act 
provides $2.5 billion for distance learn- 
ing, telemedicine, and broadband in- 
frastructure for rural communities. 
An additional $4.7 billion is available 
for broadband infrastructure proj- 
ects, including for expanding public 
access to computers, such as through 
community colleges and public librar- 
ies. And an additional $4.5 billion is 
available to upgrade the electric grid 
for enhanced electricity delivery and 
energy reliability, and will likely make 
extensive use of IT in the process. The 
Department of Energy recently estab-
lished the Advanced Research Projects
Agency-Energy (http://arpa-e.energy.gov/),
with initial funding of $400 mil-
lion; initial science and technology
research awards were announced in
February 2009. Another $2 billion is
available to coordinate deployment of
advanced health IT. Finally, the Nation-
al Science Foundation’s annual budget
is being enhanced by $2.5 billion, es-
sentially increasing its annual budget
by 50%. This provides a much-needed
increment in foundational research
funding in IT. While this is intended as
a one-time increase in funding, we are
optimistic that the Obama administra-
tion’s and Congress’s commitment to
science and technology will continue.

The U.S. should play to its strengths, notably its continued leadership in conceptualizing
idea-intensive new concepts, products, and services the rest of the world desires.

FEBRUARY 2010

quantum parts together, one gets a
notion of the algorithm—the quan- 
tum algorithm—whose computation- 
al power appears to be fundamentally 
more efficient at carrying out certain 
tasks than algorithms written for 
today’s, nonquantum, computers. 
Could this possibly be true: that there 
is a more fundamental notion of algo-
rithmic efficiency for computers built


from quantum components? And, if 
this is true, what exactly is the power 
of these quantum algorithms? 

The shot that rang round the compu- 
tational world announcing the arrival 
of the quantum algorithm was the 1994 
discovery by Peter Shor that quantum 
computers could efficiently factor nat- 
ural numbers and compute discrete 
logarithms. The problem of finding
efficient algorithms for factoring has
been burning the brains of mathema-
ticians at least as far back as Gauss,
who commented upon the problem
that “the dignity of science seems to 
demand that every aid to the solution 
of such an elegant and celebrated 
problem be zealously cultivated.” Even 
more important than the fact that 
such a simple and central problem has 
eluded an efficient algorithmic solu- 
tion is that the lack of such an efficient 
algorithm has been used as a justifica- 
tion for the security of public key cryp- 
tosystems, like RSA encryption. Shor’s
algorithm, then, didn’t just solve a 
problem of pure academic interest, but 
instead ended up showing how quan- 
tum computers could break the vast 
majority of cryptographic protocols in 
widespread use today. If we want the 
content of our public-key encrypted
messages to remain secret not only 
now, but also in the future, then Shor’s 
algorithm redefines the scope of our 
confidence in computer security: we 
communicate securely, today, given 
that we cannot build a large scale 
quantum computer tomorrow. 

Given the encryption breaking pow- 
ers promised by quantum comput- 
ers, it was natural that, in the decade 
following Shor’s discovery, research 
has focused largely on whether a 
quantum computer could be built.
While there currently appear to be no
fundamental obstacles toward build-
ing a large scale quantum computer
(and even more importantly, a
result known as the “threshold
theorem” shows that quan-
tum computers can be made resil-
ient against small amounts of noise,
thereby confirming that these are
not analog machines), the engineer-
ing challenges posed to build an RSA
breaking quantum computer are
severe and the largest quantum com-
puters built to date have fewer than
10 quantum bits (qubits). But


regardless of the progress in build- 
ing a quantum computer, if we are to 
seriously consider our understanding 
of computation as being based upon 
experimental evidence, we will have 
to investigate the power of quantum 
algorithms. Christos Papadimitriou 
said in a recent interview that the
theory of computational complexity is
such a difficult field because it is nearly
impossible to prove what everyone 
knows from experience. How, then, 
can we even begin to gain an under- 
standing of the power of quantum 
computers if we don’t have one from 
which to gain such an experience? 
Further, and perhaps even more chal- 
lenging, quantum algorithms seem to 
be exploiting the very effects that make 
quantum theory so uniquely counter-
intuitive. Designing algorithms for


a quantum computer is like building 
a car without having a road or gas to 
take it for a test drive. 

In spite of these difficulties, a 
group of intrepid multidisciplinary 


researchers have been tackling the


question of the power of quantum 
algorithms in the decades since Shor’s 
discoveries. Here we review recent 
progress on the upper bounding side 
of this problem: what new quantum 
algorithms have been discovered 
that outperform classical algorithms 
and what can we learn from these 
discoveries? Indeed, while Shor’s 
factoring algorithm is a tough act to 
follow, significant progress in quan- 
tum algorithms has been achieved. 
We concentrate on reviewing the 
more recent progress on this prob- 
lem, skipping the discussion of early 
(but still important) quantum algo-
rithms such as Grover’s algorithm
for searching (a quantum algorithm
that can search an unstructured space
quadratically faster than the best clas-
sical algorithm), but explaining some
older algorithms in order to set con-
text. For a good reference for learn-
ing about such early, now “classic”
algorithms (like Grover’s algorithm
and Shor’s algorithm) we refer the
reader to the textbook by Nielsen and
Chuang. Our discussion is largely
ahistoric and motivated by attempt-
ing to give the reader intuition as to
what motivated these new quantum
algorithms. Astonishingly, we will see
that progress in quantum algorithms
has brought into the algorithmic fold
basic ideas that have long been foun-
dational in physics: interference,
scattering, and group representation
theory. Today’s quantum algorithm
designers plunder ideas from physics,
mathematics, and chemistry and weld
them with the tried and true methods
of classical computer science, in order
to build a new generation of quantum 
contraptions which can outperform 
their classical counterparts. 


Quantum Theory in a Nutshell

Quantum theory has acquired a reputa-
tion as an impenetrable theory acces-
sible only after acquiring a significant
theoretical physics background. One
of the lessons of quantum comput-
ing is that this is not necessarily true:
quantum computing can be learned
without mastering vast amounts of
physics, but instead by learning a few
simple differences between quantum
and classical information. Before dis-
cussing quantum algorithms we first
give a brief overview of why this is true
and point out the distinguishing fea-
tures that separate quantum informa-
tion from classical information.

To describe a deterministic n-bit
system it is sufficient to write down its
configuration, which is simply a binary
string of length n. If, however, we have
n bits that can change according to pro-
babilistic rules (we allow randomness
into how we manipulate these bits), we
will instead have to specify the prob-
ability distribution of the n bits. This
means to specify the system we require
2^n nonnegative real numbers describing
the probability of the system being in a
given configuration. These 2^n numbers
must sum to unity since they are, after
all, probabilities. When we observe
a classical system, we will always find it
to exist in one particular configuration
(i.e., one particular binary string) with
the probability given by the 2^n num-
bers in our probability distribution.

Now let’s turn this approach to
quantum systems, and consider a
system made up of n qubits. Again, n
qubits will have a configuration which
is just a length n binary string. When
you observe n qubits you will only see
an n bit configuration (thus when you
hear someone say that a qubit is both
zero and one at the same time you
can rely on your common sense to tell
them that this is absurd). But now,
instead of describing our system by 2^n
probabilities, we describe a quantum
system by 2^n amplitudes. Amplitudes,
unlike probabilities (which were
nonnegative real numbers and which
summed to unity), are complex num-
bers which, when you take their abso-
lute value-squared and add them up,
sum to unity. Given the 2^n amplitudes
describing a quantum system, if you
observe the system, you will see a par-
ticular configuration with a probabil-
ity given by the modulus squared of
the amplitude for that configuration.
In other words, quantum systems are
described by a set of 2^n complex num-
bers that are a bit like square roots of
probabilities (see Figure 1).

Figure 1. Classical versus quantum information. On the left, the classical bit is described by two
nonnegative real numbers for its probabilities Pr(0) = 1/3 and Pr(1) = 2/3. The quantum bit
on the right, instead, has two complex valued amplitudes that give the (same) probabilities by
taking the absolute value-squared of its entries. When a quantum system has such a description
with nonzero amplitudes, one says that the system is in a superposition of the 0 and 1
configurations.

So far we have just said that there is
this different description for quantum
systems: you describe them by ampli-
tudes and not by probabilities. But
does this really have a consequence?
After all, the amplitudes aren’t used so
far, except to calculate probabilities.
In order to see that yes, indeed, it does
have a profound consequence, we
must next describe how to update our
description of a system as it changes
in time. One can think about this as
analyzing an algorithm where informa-
tion in our computing device changes
with time according to a specific
recipe of changes.


For a classical probabilistic com-
puting device we can describe how
it changes in time by describing the
conditional probability that the sys-
tem changed into a new configuration
given that it was in an old configura-
tion. Such a set of conditional prob-
abilities means that we can describe
a probabilistic computing action by
a stochastic matrix (a matrix whose
entries are nonnegative and whose col-
umns sum to unity). A classical proba-
bilistic algorithm can then be viewed
as just a set of stochastic matrices
describing how probabilities propa-
gate through the computing device.
If the classical probabilistic algo-
rithm starts with n bits and ends with
m bits, then the stochastic matrix
describing the algorithm will be a 2^m
by 2^n matrix.
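To make the stochastic-matrix picture concrete, here is a minimal sketch for a single bit. The flip probability of 1/4 and the names FLIP, apply_stochastic, and columns_sum_to_one are illustrative assumptions, not anything taken from the article; the starting distribution reuses the Pr(0) = 1/3, Pr(1) = 2/3 values of Figure 1.

```python
from fractions import Fraction as F

# Column j holds Pr(new = i | old = j), so every column sums to 1.
# Hypothetical one-bit rule (an assumption): flip the bit with probability 1/4.
FLIP = [[F(3, 4), F(1, 4)],
        [F(1, 4), F(3, 4)]]


def apply_stochastic(matrix, dist):
    """Propagate a probability vector through one stochastic step."""
    return [sum(row[j] * dist[j] for j in range(len(dist))) for row in matrix]


def columns_sum_to_one(matrix):
    """Check the defining property of a (column-)stochastic matrix."""
    return all(sum(row[j] for row in matrix) == 1
               for j in range(len(matrix[0])))


start = [F(1, 3), F(2, 3)]             # Pr(0) = 1/3, Pr(1) = 2/3
after = apply_stochastic(FLIP, start)  # still a valid distribution
print(after, sum(after))
```

Exact fractions are used so that the "sums to unity" property can be checked without rounding error.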

What is the analogous procedure
for a quantum system? Well, instead of
specifying conditional probabilities of
a new configuration given an old con-
figuration, in a quantum system you
need to specify the conditional ampli-
tude of a new configuration given an
old configuration. In the quantum
world, the matrix of conditional ampli-
tudes has two major differences from
the classical probabilistic setting. The
first is that quantum systems evolve
reversibly and thus the matrix is 2^n by
2^n (corresponding to the amplitude
of every configuration to change into
any other configuration). The second
is that, in order to preserve the sum
of the absolute squares of those amplitudes,
which should be 1 throughout, this
matrix is a unitary matrix, meaning
the entries of the matrix are complex
numbers, and that the rows (and col-
umns) of this matrix are orthonormal.
Thus a quantum algorithm for a quan-
tum system is given by a unitary matrix
of conditional amplitudes.
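The same bookkeeping can be sketched for a single qubit. The matrix U below is just one example of a 2 by 2 unitary, and the qubit amplitudes (with an arbitrary phase of i on the second entry) are an assumption chosen to reproduce the 1/3 and 2/3 probabilities of Figure 1; the helper names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Example unitary: column j holds the amplitudes of each new configuration
# given old configuration j. Its columns are orthonormal.
U = [[S, S],
     [-S, S]]


def apply_unitary(matrix, amps):
    """Propagate an amplitude vector through one unitary step."""
    return [sum(row[j] * amps[j] for j in range(len(amps))) for row in matrix]


def probabilities(amps):
    """Measurement probabilities are the absolute squares of the amplitudes."""
    return [abs(a) ** 2 for a in amps]


def is_unitary(matrix, tol=1e-12):
    """Check that the columns of a square matrix are orthonormal."""
    n = len(matrix)
    for a in range(n):
        for b in range(n):
            inner = sum(matrix[i][a].conjugate() * matrix[i][b]
                        for i in range(n))
            if abs(inner - (1 if a == b else 0)) > tol:
                return False
    return True


qubit = [math.sqrt(1 / 3), 1j * math.sqrt(2 / 3)]  # assumed phases
evolved = apply_unitary(U, qubit)
print(probabilities(qubit), sum(probabilities(evolved)))
```

Replacing U with a matrix whose columns are not orthonormal, such as one with every entry equal to 1/sqrt(2), makes is_unitary return False and lets the total probability drift away from 1.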

What consequence does this change
from probabilities to amplitudes
and from stochastic matrices to uni-
tary matrices have for the notion of
an algorithm? This is, of course, the 


essential question at hand when con- 
sidering quantum algorithms. In 
this survey we single out three major 
differences—quantum interference, 
the deep relationship between sym- 
metries and quantum mechanics, and 
quantum entanglement—and show 
how they are related to recent prog- 
ress in quantum algorithms. 


Interference and the Quantum 
Drunkard’s Walk 

The first of our claimed differences 
between quantum computers and clas- 
sical computers was that the former 
led to effects of quantum interference. 


What is interference and how can it
lead to new efficient algorithms?
To illustrate the ideas of interfer-
ence, consider a random walk on a
line. The standard, classical drunk-
ard’s walk on a line refers to the situation
where the walker is allowed to step
either forward or backward with equal
probability every unit time step. When
starting at position 0 at time zero,
then after one time step there is an
equal probability to be at locations +1
and -1. After the next time step, there
is a one-fourth probability of being at
each of positions -2 and 2 and one half prob-
ability of being at position 0. Notice
here that the probability of reaching
zero was the sum of two probabili-
ties: the probability that the drunkard
got to 0 via 1 and the probability that
it got to 0 via -1. Random walks on
structures more complicated than a
line are a well-known tool in classical
algorithms.


Suppose that we want to construct 
a quantum version of this drunkard’s 
walk. To specify a quantum walk, we 
need, instead of a probability for tak- 
ing a step forward or backward, an 
amplitude for doing this. However we 
also need to make sure that the unitary 
nature of quantum theory is respected. 
For example, you might think that the
quantum analogy of a classical walk is to
take a step forward and a step backward
with amplitude one over the square root
of two (since squaring this gives a prob-
ability of one half). If we start at 0, then
after one step this prescription works:
we have equal amplitude of one over
square root of two of being at either
1 or -1. If we measure the walker after
this first step, the probability of being
at 1 or -1 is one half each. But if we run
this for another time step, we see that
we have an amplitude of 1/2 to be at
-2 or 2 and an amplitude 1 to be at 0.
Unfortunately if we square these num-
bers and add them up, we get a num-
ber greater than unity, indicating that
the evolution we have described is not
unitary.
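The failure of this naive prescription is easy to check numerically. The sketch below (the function name naive_step is a hypothetical label) sends amplitude 1/sqrt(2) to each neighbor with no coin and records the sum of squared amplitudes after each of two steps.

```python
import math


def naive_step(amps):
    """Send amplitude 1/sqrt(2) to each neighbor, with no coin involved."""
    s = 1 / math.sqrt(2)
    out = {}
    for pos, amp in amps.items():
        for nxt in (pos - 1, pos + 1):
            out[nxt] = out.get(nxt, 0.0) + s * amp
    return out


state = {0: 1.0}   # walker starts at position 0 with amplitude 1
norms = []
for _ in range(2):
    state = naive_step(state)
    norms.append(sum(abs(a) ** 2 for a in state.values()))

print(state)   # amplitudes of about 1/2 at -2 and 2, and about 1 at 0
print(norms)   # about 1.0 after one step, about 1.5 after two
```

The total of 1/4 + 1 + 1/4 = 3/2 after the second step is the "number greater than unity" described above.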

The solution to this problem is to 
let the drunkard flip a quantum coin at 
each time step, after which he steps in 
the direction indicated by the quantum 
coin. What is a quantum coin? A quan- 
tum coin is simply a qubit whose two 
configurations we can call “forward” 
and “backward” indicating the direc- 
tion we are supposed to move after flip- 
ping the quantum coin. How do we flip 
such a coin? We apply a unitary trans- 
form. This unitary transform must 
specify four amplitudes. One choice of 
such a unitary transform that seems to 
mimic the drunkard’s walk is to assign 
all conditional amplitudes a value of 
one over the square root of two, with the 
exception of the amplitude to change 
from the configuration “forward” to 
the configuration “backward,” which, 
due to unitarity, we assign the ampli- 
tude negative one over square root of 
two. In other words the unitary trans- 
form we apply to flip the coin is speci- 
fied by the transition matrix 


$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad (1)$$

If we follow this prescription for a 
quantum random walk with the drunk- 
ard initially positioned at zero, one 
quickly sees that something strange 
happens. Consider, for instance, the 
probability distribution formed by 
the quantum walk had we measured 
the walker’s position after three time 
steps (see Figure 2). Then the probabil-
ity of getting to +1 for the drunkard is 1/8.
For a classical walk the similar num-
ber would be 3/8. What is going on
here? Well, if you trace back how he
could have gotten to +1 in three steps,
you’ll see that there are three paths he
could have used to get to this position.
In the classical world each of these is
traversed with equal probability, add-
ing a contribution of 1/8 for each path.
But in the quantum world, two of these
paths contribute equally but oppositely
to the amplitude to get to this position.
In other words these two paths inter-
fere with each other. Because ampli-
tudes, unlike probabilities, don’t have
to be positive numbers, they can add
up in ways that cancel out. This is the
effect known as quantum interfer-
ence. It is the same interference idea
which you see when two water waves
collide with each other. But note an
important difference here: ampli-
tudes squared are probabilities. In
water waves, the heights interfere, not
anything related to the probabilities
of the waves. This is the peculiar effect
of quantum interference.

Figure 2. Classical (top) and quantum (bottom) random walks. The probability of reaching a
particular point in space and time, given that we measure the position at that time, is listed
on the vertices.
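The three-step numbers can be reproduced with a short simulation. This is a sketch under stated assumptions: the coin has amplitude 1/sqrt(2) everywhere except forward-to-backward, which is negative, as described in the text; the walker flips the coin and then steps; and the coin is assumed to start in the "backward" configuration. Starting it in "forward" instead gives the mirror image, with the 1/8 appearing at -1. All identifiers are hypothetical.

```python
import math
from itertools import product

S = 1 / math.sqrt(2)

# COIN[(new, old)] is the amplitude to change coin configuration old -> new.
# Only the forward -> backward amplitude carries the minus sign.
COIN = {("F", "F"): S, ("F", "B"): S,
        ("B", "F"): -S, ("B", "B"): S}
MOVE = {"F": +1, "B": -1}


def quantum_step(state):
    """Flip the quantum coin, then step in the direction the coin shows."""
    out = {}
    for (pos, old), amp in state.items():
        for new in "FB":
            key = (pos + MOVE[new], new)
            out[key] = out.get(key, 0.0) + COIN[(new, old)] * amp
    return out


def position_probabilities(state):
    """Probability of each position if the walker is measured now."""
    probs = {}
    for (pos, _), amp in state.items():
        probs[pos] = probs.get(pos, 0.0) + abs(amp) ** 2
    return probs


def classical_probabilities(steps):
    """Classical drunkard's walk: every path has probability (1/2)^steps."""
    probs = {}
    for path in product((+1, -1), repeat=steps):
        pos = sum(path)
        probs[pos] = probs.get(pos, 0.0) + 0.5 ** steps
    return probs


state = {(0, "B"): 1.0}          # assumed initial coin configuration
for _ in range(3):
    state = quantum_step(state)

quantum = position_probabilities(state)
classical = classical_probabilities(3)
print(quantum[1], classical[1])
```

In this run the two paths that arrive at +1 with the coin showing "forward" carry opposite amplitudes and cancel, leaving only the single path that arrives with the coin showing "backward".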

Quantum random walks were actu-
ally first described by physicists in
1993, but only with the rise of interest
in quantum computers was it asked
whether these walks could be used as
a computational tool. An alternative,
continuous time version of these algo-
rithms (tacking more closely to ideas
in physics) has also been developed
by Farhi and Gutmann. Given these
quantum random walks, a natural
question is what does this have to do
with algorithms? Well, the first obser-
vation is that quantum random walks
behave in strange ways. For instance a
well-known property of classical ran-
dom walks on a line is that the expected
standard deviation of a random walk
as a function of the number of steps
taken, T, scales like the square root
of T. However, for a quantum random
walk the standard deviation can actu-
ally spread linearly with T. Remarkably,
this difference has been well known
to physicists for a long time: it turns
out that the quantum random walk
defined above is closely related to the
Dirac equation for a one-dimensional
electron (the Dirac equation is a way


to get quantum mechanics to play 
nicely with the special theory of rela- 
tivity, and is a basic equation used in 
modern quantum field theory). This 
discovery that quantum algorithms 
seem to explore space quadratically 
faster than classical random walks has 
recently been shown to lead to quan- 
tum algorithms that polynomially out- 
perform their classical cousins. 

One example of an algorithm based
upon quantum random walks is the
algorithm for element distinctness
due to Ambainis. The element dis-
tinctness problem is, given a function
f from {1, 2,...,N} to {1, 2,...,N}, to deter-
mine whether there exist two indices
i ≠ j such that f(i) = f(j). Classically
this requires Ω(N) queries to the
function f. Ambainis showed how a
quantum random walk algorithm for
this problem could be made to work
using O(N^(2/3)) queries: an improvement
which has not been achieved using
any other quantum methods to date.
Other algorithms that admit speed-
ups of a similar nature by using quan-
tum random walks are spatial search
(searching a spatially d-dimensional
space), triangle finding, and verify-
ing matrix products. Quantum ran-
dom walk algorithms, then, are a
powerful tool for deriving new quan-
tum algorithms.
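For comparison, the classical side of the element distinctness problem can be sketched in a few lines. This is only the straightforward classical approach with a query counter, not Ambainis's quantum walk; the function name has_collision and the sample functions are hypothetical.

```python
def has_collision(f, n):
    """Return (collision_found, queries_used) for f on {1, ..., n}.

    Each call to f counts as one query. When all values are distinct,
    every one of the n inputs must be queried before answering.
    """
    seen = set()
    queries = 0
    for i in range(1, n + 1):
        value = f(i)
        queries += 1
        if value in seen:
            return True, queries
        seen.add(value)
    return False, queries


def identity(i):
    """A function with no collisions."""
    return i


def late_collision(i):
    """Distinct values except that f(8) repeats f(3)."""
    return 3 if i == 8 else i


print(has_collision(identity, 8))        # all 8 inputs are queried
print(has_collision(late_collision, 8))  # the collision shows up at i = 8
```

The query count is what the bounds in the text measure: this classical routine can need all N queries, whereas the quantum walk algorithm needs on the order of N^(2/3).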

These examples all achieved poly- 
nomial speedups over the best pos- 
sible classical algorithms. Given that 
quantum random walks can be used 
to polynomially outperform classical 


computers at some tasks, a natural
question is whether quantum com-
puters can be used to exponentially
outperform classical computers. The
answer to this question was first given
by Childs et al., who showed that a
quantum random walk could traverse 
a graph exponentially faster than any 
possible classical algorithm walking 
on this graph. In Figure 3 we show the 
graph in question: the crux of the idea 
is that a quantum algorithm, by con- 
structively or destructively interfering, 
can traverse this graph, while a clas- 
sical algorithm will always get stuck 
in the middle of the graph. Construc-
tive interference refers to the condi-
tion where quantum evolution causes
amplitudes to increase in absolute
magnitude (and hence in probability)
while destructive interference refers to
where the evolution causes amplitudes
to decrease in absolute magnitude
(and hence decrease in probability).

Figure 3. An example of a graph arising in the quantum random walk problem considered by
Childs et al. In this problem one is given access to a function that takes as input a vertex and
returns a list of the vertex’s neighbors. The goal of the problem considered by Childs et al.
is, by querying the function as few times as possible, to traverse from the start vertex to the
end vertex. The graphs considered are two full binary trees pasted together with a random
cycle (in the example, the cycle resides inside the dashed box) whose roots are the start and
end vertices. The quantum algorithm starts at the start vertex and then performs a quantum
diffusion to the end vertex. The random cycle in the middle does not destroy this diffusion,
since all paths contribute equally to this diffusion. For a graph of depth d, the quantum walk
will find the end vertex by querying the local vertex function a polynomial number of times
in d. The best classical algorithm can be shown to require querying the function for local
vertex information exponentially many times in d.

In spite of this success, the above
problem, traversing this graph, does
not appear to have a good algorithmic
use. Thus a subject of great research
interest today is whether there are
quantum random walk algorithms
that offer exponential speedups over
classical algorithms for interesting
algorithmic problems.


Quantum Algorithms 
and Game Playing 


Quantum interference, the ability of
multiple computational paths to add
or detract amplitudes and thus lower
and raise probabilities, is an effect
well known to physicists. Given this,
it is interesting to ask whether other
techniques from the physicist’s toolbox
might also be of use in algorithms.
A great example of this approach was
the recent discovery by Farhi et al. of
a quantum algorithm that outperforms
all possible classical algorithms for
the evaluation of NAND tree circuits.
This algorithm was derived, amazingly,
by considering the scattering of wave
packets off certain binary trees. As
a quintessential physics experiment
involves shooting one quantum system
at another and observing the resulting
scattered ‘outputs,’ physicists have
developed a host of tools for analyz- 
ing such scattering experiments. It 
was this approach that led the above 
authors to the following important 
new quantum algorithm. 

To illustrate the NAND tree problem, consider the following two-player
game. The players are presented with a complete binary tree of depth k.
On the leaves of the tree are labels that declare whether player A or
player B wins by getting to this node. At the beginning of a match, a
marker is placed at the root of the tree. Players take alternating turns
moving this marker down a level in the tree, choosing one of the two
possible paths, with the goal, of course, of ending up at a leaf labeled
by the player’s name. A natural question to ask is if it is always
possible for player A, with its first move, to win the game. Evaluating
whether this is the case can be deduced inductively in the following
way. Suppose player A makes the last move. Then player A will be able to
win if the marker is on a node with at least one of its children labeled
“A wins”; hence we should label such internal nodes with “A wins” as
well. This line of reasoning holds in general for all internal nodes on
which A makes a move: as soon as one of its children has the label “A
wins,” then the node inherits the same conclusion. On the other hand, if
none of the children has this label, then we can conclude that “B wins.”
Player B will, of course, be reasoning in a similar manner. Thus we can
see that player A will win, starting from a node of height two (where B
moves), only if both of the children of the node lead to positions where
A wins. We can then proceed inductively using this logic to evaluate
whether player A can always win the game with a move originating from
the root of the tree. If we label the leaves where player A wins by 1
and where player B wins by 0, then we can compute the value of the root
node (indicating whether player A can always win) by representing the
interior layers of the tree by alternating layers of AND and OR gates.
Further, it is easy to see that one can transform this from alternating
layers of AND and OR gates to uniform layers of NAND (negated AND)
gates, with a possible flipping of the binary values assigned to the
leaves.
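
The equivalence between the AND/OR game tree and the uniform NAND tree can be checked on small cases. The following Python sketch is only an illustration under stated assumptions: player A moves first at the root, the leaves are given left to right as a list of 0s and 1s, and the function names are our own. Under these assumptions the leaves need to be flipped exactly when the depth of the tree is odd.

```python
def game_value(leaves, a_to_move=True):
    """Return 1 if player A can force a win, else 0.

    `leaves` holds the leaf labels left to right (1 = "A wins", 0 = "B wins").
    A node where A moves behaves like an OR gate, a node where B moves
    behaves like an AND gate. Assumes A moves at the root.
    """
    if len(leaves) == 1:
        return leaves[0]
    mid = len(leaves) // 2
    left = game_value(leaves[:mid], not a_to_move)
    right = game_value(leaves[mid:], not a_to_move)
    return (left | right) if a_to_move else (left & right)


def nand_tree(leaves):
    """Evaluate a complete binary tree whose internal nodes are all NAND gates."""
    if len(leaves) == 1:
        return leaves[0]
    mid = len(leaves) // 2
    return 1 - (nand_tree(leaves[:mid]) & nand_tree(leaves[mid:]))


def game_value_via_nand(leaves):
    """Compute the same answer as game_value using NAND gates only."""
    depth = len(leaves).bit_length() - 1
    if depth % 2 == 1:
        # The layer just above the leaves is an OR layer: flip the leaf values.
        leaves = [1 - bit for bit in leaves]
    return nand_tree(leaves)
```

For instance, `game_value([1, 0, 0, 1])` and `game_value_via_nand([1, 0, 0, 1])` both give 0: whichever subtree A chooses, B can steer the marker to a leaf labeled 0.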

We have just shown that the problem of evaluating whether the first
player has a series of moves that guarantees victory is equivalent to
evaluating the value of a NAND tree circuit given a labeling of the
leaves of the tree. Further, if the player can evaluate any interior
value of the NAND tree, then one can use this to actually win the game.
If such a procedure is available, one can simply use the algorithm to
evaluate the two subtrees and, if one of them is always a win, take that
move. Thus the problem of evaluating the value of the NAND tree is of
central importance for winning this game. The NAND tree is an example of
the more general concept of a game tree, which is useful for the study
of many games such as Chess and Go. In these latter games, more than two
moves are available, but a similar logic for evaluating whether there is
a winning strategy applies. This problem, of which the NAND tree circuit
is the smallest example, is a central object in the study of
combinatorial games.

One can now ask how costly it is to evaluate the NAND tree: how many
nodes does one need to query in order to compute the value of the NAND
tree? One could evaluate every leaf and compute the root, but certainly
this is wasteful: if you ever encounter a subtree which evaluates to 0,
you know that the parent of this subtree must evaluate to 1. A
probabilistic recursive algorithm is then easy to think up: evaluate a
subtree by first evaluating, at random, either its left or right
subtree. If this (left or right) subtree is 0, then the original subtree
must have value 1. If not, evaluate the other subtree. This method,
known as alpha-beta pruning, has a long history in artificial
intelligence research. For the NAND tree, one can show that by
evaluating about $\Theta(N^{0.753})$ of the N leaves one can calculate
the value of the NAND tree with high probability. It is also known that
this value for the number of leaves needed to be queried is optimal.
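
The randomized recursion described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis behind the query bound: the function names, the flat list representation of the leaves, and the bookkeeping of which leaves were read are all our own assumptions.

```python
import random


def nand_full(leaves, lo=0, hi=None):
    """Reference evaluation that reads every leaf in leaves[lo:hi]."""
    if hi is None:
        hi = len(leaves)
    if hi - lo == 1:
        return leaves[lo]
    mid = (lo + hi) // 2
    return 1 - (nand_full(leaves, lo, mid) & nand_full(leaves, mid, hi))


def nand_randomized(leaves, rng, lo=0, hi=None, queried=None):
    """Evaluate the NAND tree over leaves[lo:hi] with random child order.

    Returns (value, queried), where `queried` is the set of leaf indices
    that were actually read. If the first child evaluates to 0, the
    sibling subtree is skipped because the NAND must be 1.
    """
    if hi is None:
        hi = len(leaves)
    if queried is None:
        queried = set()
    if hi - lo == 1:
        queried.add(lo)
        return leaves[lo], queried
    mid = (lo + hi) // 2
    halves = [(lo, mid), (mid, hi)]
    rng.shuffle(halves)  # pick left or right first, uniformly at random
    (a_lo, a_hi), (b_lo, b_hi) = halves
    first, _ = nand_randomized(leaves, rng, a_lo, a_hi, queried)
    if first == 0:
        return 1, queried
    second, _ = nand_randomized(leaves, rng, b_lo, b_hi, queried)
    # The first child is 1 here, so NAND(1, second) = 1 - second.
    return 1 - second, queried
```

Calling `nand_randomized(leaves, random.Random(0))` on a list of N = 2^k leaf values returns the same value as `nand_full(leaves)`, while `len(queried)` reports how many of the N leaves were read on that run.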

For a long period of time it was uncertain whether quantum computers
could perform better than this. Using standard lower bounding methods,
the best lower bound which could be proved was $\Omega(N^{1/2})$, yet no
quantum algorithm was able to achieve such a speedup over the best
classical algorithm. Enter onto the scene the physicists Farhi,
Goldstone, and Gutmann. These authors considered a continuous quantum
random walk of a strange form. They considered a quantum random walk on
the graph formed by a binary tree (of size related to the NAND tree
being evaluated) attached to a long runway (see Figure 4). They then
showed how, if one constructed an initial quantum system whose initial
state was that of a quantum system moving to the right towards the
binary tree, one could then obtain the value of the NAND tree by seeing
whether such a quantum system scattered back off the binary tree, or
passed through along the other side of the runway. The time required to
see this scattering or lack of scattering was shown to be proportional
to $O(N^{1/2})$. In other words, the NAND tree could be evaluated in
$O(N^{1/2})$ time by scattering a wave packet off of a binary tree
representing the NAND tree problem. A few simple modifications can bring
this in line with the standard computer scientist's definition of a
query algorithm for the NAND tree problem. Presto, out of a scattering
experiment, one can derive a quantum algorithm for the NAND tree problem
which gives an $O(N^{1/2})$ algorithm outperforming a classical computer
science algorithm. Building upon this work, a variety of different trees
with different branching ratios and degrees of being balanced have been
explored, showing quantum speedups. Indeed one remarkable aspect of much
of this work is that while in many cases the classical versions of these
problems do not have matching upper and lower bounds, in the quantum
case matching upper and lower bounds can now be achieved.


Figure 4. The NAND tree algorithm of Farhi, Goldstone, and Gutmann.¹⁰

First, a tree is constructed where the presence or absence of leaves at
the top of the tree corresponds to the binary input values to the NAND
tree problem. Next, a wave packet is constructed which, if the tree were
not attached, would propagate to the right. When the tree is attached,
as shown, the value of the NAND tree can be determined by running the
appropriate quantum walk and observing whether the wave packet passes to
the right of the attached tree or is reflected backwards.


Finding Hidden Symmetries 
If interference is a quantum effect that leads to polynomial speedups,
what about the quantum algorithms that appear to offer exponential
speedups, like in Shor’s algorithm for factoring or the quantum random
walk algorithm of Childs et al. described here? Here it seems that just
using interference by itself is not sufficient for gaining such
extraordinary power. Instead, in the vast majority of cases where we
have exponential speedups for quantum algorithms, a different candidate
emerges for giving quantum computers power: the ability to efficiently
find hidden symmetries. Here we review recent progress in algorithms
concerning hidden symmetries. In many respects these algorithms date
back to the earliest quantum algorithms, a connection we first briefly
review, before turning to more modern ways in which this has influenced
finding new quantum algorithms.

We say an object has symmetry if “we can do something to it without
changing it.” The things we can do are described by the elements of a
group and the object itself is a function that is defined on the same
group. That this does not have to be as abstract as it seems is
illustrated in Figure 5 for the group of three-dimensional rotations and
the icosahedral symmetry of a soccer ball.


Figure 5. The symmetries of a soccer ball.

Of all the possible three-dimensional rotations that one can apply, only
a finite number of them leave the image of a standard soccer ball
unchanged. This subgroup, the icosahedral rotation group with its 60
elements, therefore describes the symmetries of the object;
http://en.wikipedia.org/wiki/File:Trunc-icosa.jpg/


Given a group G the symmetry of a function f defined on G can range from
the trivial (when only the identity of G leaves f unchanged) to the
maximum possible symmetry where f remains unchanged under all possible
group operations. The most interesting cases happen when f is invariant
under only a proper subgroup H of G and the task of finding this H,
given f, is known as the hidden subgroup problem. For many different
types of groups we know how to solve this problem efficiently on a
quantum computer, while no classical algorithm can perform the same
feat. We claim that this is because quantum computers can more
efficiently exploit problems with hidden symmetries.

To illustrate how quantum computers are better suited to deal with
symmetries, let’s talk about the simplest symmetry one can talk about:
the symmetry of flipping a bit. Consider the operation X of negating a
bit and the identity operation I. If we perform X twice, we obtain the
operation I of doing nothing at all, which shows that I and X together
form a group. Next, consider representing how I and X operate on a
classical probabilistic bit. Such a binary system is described by a
two-dimensional vector of probabilities, corresponding to the
probability $p_0$ of being in 0 and $p_1$ of being in 1. The operations
I and X can then be represented on this system as the two-by-two
matrices

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

In group theoretic parlance, we say that these two matrices form a
representation of the group, which effectively means that the
multiplication among these matrices mimics the operation among the
elements of the group that is being represented.

But now notice how the matrices for I and X act on the vector space
$\mathbb{R}^2$. Naturally, the identity matrix I leaves all vectors
unchanged, but the X matrix acts in a more interesting way. If X acts on
the symmetric vector [1, 1], then, like I, it preserves this vector. If,
on the other hand, X operates upon the vector [1, -1], then it
multiplies this vector by -1. This new vector [-1, 1] still sits in the
one-dimensional subspace spanned by the original [1, -1], but the
direction of the vector has been reversed. In other words, the act of
flipping a bit can naturally be broken down into its action upon two
one-dimensional subspaces: on the first of these the group always acts
trivially, while on the other it always acts by multiplying by the
scalar -1. Now we can see why classical probabilistic information is at
odds with this symmetry: while we can create a symmetric probability
distribution [1/2, 1/2] wherein the bit flip X preserves this
distribution, we cannot create the other probability distribution that
transforms according to the multiplication by -1: doing so would require
that we have negative probabilities. But wait, this is exactly what the
amplitudes of quantum computers allow you to do: to prepare and analyze
quantum information in all the relevant subspaces associated with group
operations such as flipping a bit. Unlike classical computers, quantum
computers can analyze symmetries by realizing the unitary transforms
which directly show the effects of these symmetries. This, in a
nutshell, is why quantum algorithms are better adapted to solve problems
that involve symmetries.
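
The bit-flip example is small enough to check by hand, but a short Python sketch makes the contrast between probabilities and amplitudes concrete. The helper names below are our own, and the two "validity" checks are simplified assumptions: a probability vector must be nonnegative and sum to 1, while an amplitude vector only needs its squared magnitudes to sum to 1.

```python
import math


def apply(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


I = [[1, 0], [0, 1]]  # identity: do nothing
X = [[0, 1], [1, 0]]  # bit flip


def is_probability_vector(v, tol=1e-12):
    """Entries are nonnegative and sum to 1."""
    return all(x >= 0 for x in v) and abs(sum(v) - 1) < tol


def is_amplitude_vector(v, tol=1e-12):
    """Squared magnitudes sum to 1; negative entries are allowed."""
    return abs(sum(abs(x) ** 2 for x in v) - 1) < tol


# X leaves [1, 1] alone and multiplies [1, -1] by -1.
symmetric = apply(X, [1, 1])       # [1, 1]
antisymmetric = apply(X, [1, -1])  # [-1, 1]

# The normalized antisymmetric vector is a legal quantum state
# but not a legal probability distribution.
s = 1 / math.sqrt(2)
minus = [s, -s]
```

Here `is_amplitude_vector(minus)` holds while `is_probability_vector(minus)` does not, which is the sense in which only amplitudes can occupy the subspace on which X acts as multiplication by -1.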

The idea that symmetry is the excelsior of exponential quantum speedups
now has considerable evidence in its favor and is one of the major
motivators for current research in quantum algorithms. Shor’s algorithm
for factoring works by converting the problem of finding divisors to
that of finding periods of a function defined over the integers, which
in turn is the problem of determining the translational symmetries of
this function. In particular Shor’s algorithm works by finding the
period of the function $f(x) = r^x \bmod N$ where r is a random number
coprime with N, the number one wishes to factor. If one finds the period
of this function, i.e. the smallest nonzero p such that f(x) = f(x + p),
then one has identified a p such that $r^p \equiv 1 \pmod N$. If p is
even (which happens with constant probability for random r), then we can
express this equation as $(r^{p/2} + 1)(r^{p/2} - 1) \equiv 0 \pmod N$.
This implies that the greatest common divisor of $r^{p/2} + 1$ and N or
the greatest common divisor of $r^{p/2} - 1$ and N is a divisor of N.
One can then use the Euclidean algorithm to find a factor of N (should
it exist). Thus one can efficiently factor assuming one can find the
period p of f(x). This fact was known before Shor’s discovery; the task
of determining the period p is what requires a quantum computer.
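
The classical half of this reduction, from a known period to a factor, fits in a few lines. The Python sketch below is illustrative only: the function names are our own, and the period is found by brute force, which takes time exponential in the number of digits of N and is precisely the step a quantum computer would replace. It also makes explicit the extra check that $r^{p/2}$ is not congruent to -1 modulo N, in which case the attempt fails and another r must be tried.

```python
from math import gcd


def find_period(r, n):
    """Smallest p > 0 with r**p = 1 (mod n), by brute force.

    Assumes n > 1 and gcd(r, n) == 1, so that such a p exists.
    """
    p, value = 1, r % n
    while value != 1:
        value = (value * r) % n
        p += 1
    return p


def factor_from_period(r, n):
    """Try to extract a nontrivial factor of n from the period of r modulo n.

    Returns a factor, or None if this choice of r does not work
    (odd period, or r**(p/2) = -1 mod n).
    """
    common = gcd(r, n)
    if common != 1:
        return common  # r already shares a factor with n
    p = find_period(r, n)
    if p % 2 == 1:
        return None
    half = pow(r, p // 2, n)  # r**(p/2) mod n
    if half == n - 1:
        return None
    # (half - 1)(half + 1) = 0 mod n with half != +1, -1 mod n,
    # so gcd(half - 1, n) is a nontrivial divisor of n.
    return gcd(half - 1, n)
```

For N = 15 and r = 7 the period is 4, $7^2 \bmod 15 = 4$, and `factor_from_period(7, 15)` returns gcd(3, 15) = 3.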
How, then, can a quantum algorithm find the period p of a function f?
The answer is: by exploiting the just-described friendly relationship
between quantum mechanics and group theory. One starts with a system of
two quantum registers, call them left and right. These are prepared into
a state where with equal amplitude the left register contains a value x
and the right register carries the corresponding function value f(x).
The hidden symmetry of this state is captured by the fact that it
remains unchanged if we add p (or a multiple of p) to the left register;
adding a non-multiple of p will, on the other hand, change the state. To
extract this hidden symmetry, let us view the amplitudes of the state as
the values of a function from n-bit strings to the complex numbers. We
would like to use a quantum version of the Fourier transform to extract
the symmetry hidden in this function. Why the Fourier transform? The
answer to this is that the Fourier transform is intimately related to
the symmetry of addition modulo N. In particular if we examine the
process of addition where we have performed a Fourier transform before
the addition and an inverse Fourier transform after the addition, we
will find that it is now transformed from an addition into
multiplication by a phase (a complex number z such that |z| = 1).
Addition can be represented on a quantum computer as a permutation
matrix: a matrix with only a single one per column and row of the
matrix. If we examine how such a matrix looks in the basis change given
by the Fourier transform, then we see that this matrix only has entries
on the diagonal of the matrix. Thus the Fourier transform is exactly the
unitary transform which one can use to “diagonalize the addition matrix”
with respect to the symmetry of addition, which in turn is exactly the
form of the symmetry needed for period finding.
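
The claim that the Fourier transform diagonalizes addition can be verified numerically on a small example. The sketch below is a plain classical calculation with our own helper names; it assumes the usual unitary discrete Fourier matrix with entries $\omega^{jk}/\sqrt{n}$, where $\omega = e^{2\pi i/n}$, and represents "add s modulo n" as a permutation matrix.

```python
import cmath


def fourier_matrix(n):
    """Unitary discrete Fourier transform on n points."""
    omega = cmath.exp(2j * cmath.pi / n)
    scale = 1 / n ** 0.5
    return [[scale * omega ** ((j * k) % n) for k in range(n)] for j in range(n)]


def addition_matrix(n, shift=1):
    """Permutation matrix sending basis state x to (x + shift) mod n."""
    return [[1 if row == (col + shift) % n else 0 for col in range(n)]
            for row in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def addition_in_fourier_basis(n, shift=1):
    """Return F * S * F^dagger, the addition matrix in the Fourier basis."""
    f = fourier_matrix(n)
    return matmul(matmul(f, addition_matrix(n, shift)), dagger(f))
```

For any n and shift s, the result of `addition_in_fourier_basis(n, s)` is diagonal up to rounding error, and its a-th diagonal entry is the phase $\omega^{as}$, a complex number of magnitude 1.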

The output of the quantum Fourier transformation will reveal to us which
symmetries the state has, and by repeating this Fourier sampling a few
times we will be able to learn the exact subgroup that the state hides,
thus giving us the period p (and hence allowing us to factor). Crucially
the quantum Fourier transform can be implemented on a number of qubits
logarithmic in the size of the addition group, log N, and in a time
polynomial in log N as well. If one were to attempt to mimic Shor’s
algorithm on a classical computer, one would need to perform a Fourier
transform on N classical pieces of data, which would require N log N
time (using the fast Fourier transform). In contrast, because Shor’s
quantum algorithm acts on quantum amplitudes, instead of on classical
configuration data, it leads to an efficient quantum algorithm for
factoring.

This symmetry analysis results from the basics of group representation
theory: symmetries are described by groups, and the elements of these
groups can be represented by unitary matrices. This is something that
classical probabilistic computers cannot exploit: the only way to
represent a group on a classical computer is to represent it by
deterministic permutations. But while a group can be represented by
unitary matrices, no such representation is possible using stochastic
matrices. This, at its heart, appears to be one of the key reasons that
quantum computers offer exponential benefits for some problems over
classical computers.

Given that Shor’s algorithm exploits symmetry in such a successful way,
it is natural to search for other problems that involve hidden
symmetries. Following Shor’s discovery it was quickly realized that
almost all prior quantum algorithms could be cast in a unifying form as
solving the hidden subgroup problem for one group or another. For Shor’s
algorithm the relevant group is the group of addition modulo N. For the
discrete logarithm problem the relevant group is the direct product of
the groups of addition modulo N. Indeed it was soon discovered that for
all finite Abelian groups (Abelian groups are those whose elements all
commute with each other) quantum computers could efficiently solve the
hidden subgroup problem. A natural follow-up question is: what about the
non-Abelian hidden subgroup problem? And, even more importantly, would
such an algorithm be useful for any natural problems, as the Abelian
hidden subgroup problem is useful for factoring?

One of the remarkable facts about the problem of factoring is its
intermediate computational complexity. Indeed, if one examines the
decision version of the factoring problem, one finds that this is a
problem which is in the complexity class NP and in the complexity class
co-NP. Because of this fact it is thought to be highly unlikely that it
is NP-complete, since if it were, then the polynomial hierarchy would
collapse in a way thought unlikely by complexity theorists. On the other
hand, there is no known efficient classical algorithm for factoring.
Thus factoring appears to be of Goldilocks complexity: not so hard as to
revolutionize our notion of tractability by being NP-complete, but not
so easy as to admit efficient classical solution. There are,
surprisingly, only a few problems which appear to fit into this
category. Among them, however, are the problems of graph isomorphism and
certain shortest-vector problems in lattices. Might quantum computers
help in solving these problems efficiently?


Soon after Shor’s algorithm was phrased as a hidden subgroup problem, it
was realized that if you could efficiently solve the hidden subgroup
problem over the symmetric group (the group of permutations of n
objects), then you would have an efficient quantum algorithm that solves
the graph isomorphism problem. Further, Regev” showed how the hidden
subgroup problem over the dihedral group (the group of symmetries of a
regular polygon where one can not only rotate but also flip the object)
relates to finding short vectors in a high dimensional lattice. Hence a
hypothetical efficient quantum algorithm for this dihedral case could be
used to solve such shortest vector problems. This in turn would break
the public key cryptosystems that are based upon the hardness of these
lattice problems, which are among the very few cryptosystems not broken
by Shor’s algorithm. As a result of these observations about the
non-Abelian hidden subgroup problem, designing quantum algorithms for
such groups has become an important part of the research in quantum
computation. While a certain amount of progress has been achieved (by
now we know of many non-Abelian groups over which the hidden subgroup
problem can be solved efficiently), this problem remains one of the
outstanding problems in the theory of quantum algorithms.

At the same time, going back to the Abelian groups, there has been quite
some success in finding new applications of the quantum algorithm for
the Abelian hidden subgroup problem, besides factoring and discrete
logarithms. Hallgren™ showed that there exists a quantum algorithm for
solving Pell’s equation (that is, finding integer solutions x, y to the
quadratic equation $x^2 - dy^2 = 1$, see Table 1), while Kedlaya’’ has
described a quantum procedure that efficiently counts the number of
points of curves defined over finite fields. Furthermore, other
efficient quantum algorithms have been found for, among other problems,
determining the structure of black box groups, estimating Gauss sums,
finding hidden shifts, and estimating known invariants.
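
To get a feel for Pell's equation, one can search for the smallest positive solution classically. The sketch below is a naive brute-force search with our own function names and an arbitrary search limit `max_y`; it is not Hallgren's algorithm and becomes hopeless once the solutions grow large, which is exactly the regime where a quantum algorithm is of interest.

```python
from math import isqrt


def is_pell_solution(d, x, y):
    """Check whether (x, y) satisfies x^2 - d*y^2 = 1."""
    return x * x - d * y * y == 1


def smallest_pell_solution(d, max_y=10**6):
    """Smallest (x, y) with y >= 1 and x^2 - d*y^2 = 1, by trying y = 1, 2, ...

    Returns None if no solution is found with y <= max_y
    (for example when d is a perfect square).
    """
    for y in range(1, max_y + 1):
        x_squared = 1 + d * y * y
        x = isqrt(x_squared)
        if x * x == x_squared:
            return x, y
    return None
```

For d = 13 the search returns (649, 180) after 180 steps. For d = 6,013 the smallest y already exceeds five hundred million, so the loop above would run for a very long time, although `is_pell_solution` can still verify a proposed solution instantly.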


Table 1. Some examples of integer solutions (x, y) to Pell’s equation
$x^2 - dy^2 = 1$ for different values d.

Such solutions tell us what the units are of the number field
$\mathbb{Q}[\sqrt{d}]$ (the rational numbers extended with the
irrational $\sqrt{d}$) and thereby solve the unit group problem.
Hallgren's result shows how this problem can be solved efficiently on a
quantum computer, while no such algorithm is known for classical
computers.

d       x                                                            y
2       3                                                            2
3       2                                                            1
5       9                                                            4
13      649                                                          180
14      15                                                           4
6,009   131,634,010,632,725,315,892,594,469,510,599,473,884,013,975  1,698,114,661,157,803,451,688,949,237,883,146,576,681,644
        (≈ 1.3 × 10^44)                                              (≈ 1.6 × 10^42)
6,013   40,929,908,599                                               527,831,340


Simulating Quantum Physics

A final area in which quantum algorithms have made progress goes back to
the very roots of quantum computing and indeed of classical computing
itself. From their earliest days, computers have been put to use in
simulating physics. Among the difficulties that were soon encountered in
such simulations was that quantum systems appeared to be harder to
simulate than their classical counterparts. But, of course, somehow
nature, which obeys quantum theory, is already carrying out “the
simulation” involved in quantum physics. So, if nature is carrying out
the simulation, then should we be able to design a computer that also
can perform this simulation? This was in fact the seed of the idea that
led to the original notion of quantum computing by Feynman."

To put this in perspective, consider the problem of simulating classical
physics. The miracle of reproducing classical physics on a classical
computer is that you can use many ‘particles’ with small state spaces
(bits) to mimic a few particles that have very large state spaces. For
this to be possible it is required that the number of bit
configurations, $2^{(\text{number of bits})}$, is at least as big as the
number of possible states of the physical system (which is the size of
the particle’s state space exponentiated with the number of particles).
As a result, we can simulate the solar system on a laptop.
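
The counting argument can be turned into a one-line calculation. In the sketch below the function name and the example numbers are purely hypothetical; it computes the smallest number of bits b for which $2^b$ is at least (states per particle) raised to the power (number of particles).

```python
def bits_needed(states_per_particle, particles):
    """Smallest b such that 2**b >= states_per_particle ** particles."""
    total_states = states_per_particle ** particles
    # For an integer M >= 1, (M - 1).bit_length() is the least b with 2**b >= M.
    return (total_states - 1).bit_length()
```

For example, nine bodies that are each resolved into a thousand distinguishable states give $1000^9 = 10^{27}$ configurations, and `bits_needed(1000, 9)` evaluates to 90. The number of bits grows only linearly with the number of particles.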

Quantum computing does the same thing for quantum mechanical systems;
now $2^{(\text{number of qubits})}$ is the dimension of the state space
and it allows us to simulate other quantum physical systems that consist
of a few particles with exponentially large state spaces. Here, however,
it appears essential that we rely on quantum computing components in
order to simulate the truly quantum mechanical components of a physical
system. A crucial question therefore is: which physical systems are
interesting to simulate in such a manner?

While the complete answer to this question is not known, a deeper look
at quantum algorithms for simulating quantum physics is now being
undertaken in several places. As an example, a group of physical
chemists have recently compared how useful quantum computers would be
for computing the energy level structure of molecular systems.’ This is
a classical problem of physical chemistry, and our inability to perform
these calculations robustly for large molecules is a bottleneck in a
variety of chemical and biological applications. Could quantum computers
help in solving this problem and outperform the best classical
algorithms? One of the exciting findings in studying this problem was
that a small quantum computer, consisting of only a few hundred qubits,
could already outperform the best classical algorithms for this problem.
This small number makes it likely that among the first applications of a
quantum computer will not be factoring numbers, but instead will be
simulating quantum physics. Indeed, we believe that a quantum computer
will be able to efficiently simulate any possible physical system and
that it therefore has the potential to have a huge impact on everything
from drug discovery to the rapid development of new materials.


Conclusion
The discovery that quantum computers could efficiently factor is, even
today, difficult to really appreciate. There are many ways to get out of
the conundrum posed by this discovery, but all of these will require a
fundamental rewriting of our understanding of either physics or computer
science. One possibility is that quantum computers cannot be built
because quantum theory does not really hold as a universal theory.
Although disappointing for quantum computer scientists, such a
conclusion would be a major discovery about one of the best tested
physical theories: quantum theory. Perhaps there is a classical
algorithm for efficiently factoring integers. This would be a major
computer science discovery and would blow apart our modern public key
cryptography. Or perhaps, just perhaps, quantum computers really are the
true model of computing in our universe, and the rules of what is
efficiently computable have changed. These are the dreams of quantum
computer scientists looking for quantum algorithms on the quantum
machines that have yet to be quantum programmed.
Strange Effects in High Dimension


By Sanjoy Dasgupta 


IN STUDYING THE genetic basis of a disease, it is now common to select a
set of relevant genes G, and to measure how strongly they are expressed
in cell samples from a group of patients, some healthy and some ill.!
The expression level of a gene is mapped to a value in [-1,1]. Each
patient's data is then a vector with one entry per gene, or
equivalently, a point in $\mathbb{R}^{|G|}$. The size of G is frequently
in the thousands, which makes the data high-dimensional by present
standards.

Analyzing data in a high-dimensional space $\mathbb{R}^d$ is rife with
challenges. First, many statistical estimators are accurate only when
the number of data points (call it n) is orders of magnitude larger than
d. Some models, such as histograms, have error rates of the form
$n^{-1/d}$, so that to halve the error, you need $2^d$ times as much
data: bad news even for double-digit d. A second difficulty is
computational, as typified by the ever-important 2-means clustering
problem: divide a data set into two groups, each with a designated
center, to minimize the average squared distance from data points to
their closest centers. Naive approaches run in time $n^d$, which is
astronomical even for smallish d, and NP-hardness results have dampened
hopes for dramatically better algorithms. As a result, such problems are
attacked with heuristics that offer no guarantees on their solutions.
Understanding the quality of these schemes is doubly tricky because they
are often justified on intuitive grounds, inevitably informed by
experiences of 2D and 3D space, while high dimensional spaces are full
of strange and counterintuitive effects.

Such complications are collectively referred to as the curse of
dimension: a superstition that the murky realm of high dimension brings
bad luck. But mathematical analysis is starting to clear away the haze.
A pleasant surprise is that the counterintuitive geometry of
high-dimensional space, once properly characterized, can be exploited to
defeat other facets of the curse.

Even a solid ball, the simplest of objects, has unusual aspects in high
dimension. For d=2 or d=3, a set of points picked at random from the
unit ball $B_d = \{x \in \mathbb{R}^d : \|x\| \le 1\}$ will have some
significant fraction near the origin, say within distance 1/2 of it. But
we wouldn't find even one such point in high dimension unless we were to
draw exponentially many points from the ball. This is because points at
distance r < 1 from the origin constitute an $r^d$ fraction of $B_d$,
and this fraction goes rapidly to zero with rising dimension. For large
d, the supposedly filled-in ball $B_d$ is in fact well approximated by a
thin, hollow shell, $\{x \in \mathbb{R}^d : 1 - \epsilon \le \|x\| \le 1\}$
for $\epsilon = O(1/d)$.

Here’s another curiosity. Suppose we need lots of vectors in
$\mathbb{R}^d$ that are orthogonal (at right angles) to each other. How
many can we get? Exactly d, because this is the maximum number of
linearly independent vectors. But if we only need them approximately
orthogonal, with angles that need not be exactly 90 degrees, but
$90 \pm \epsilon$, then we can find exponentially many vectors. A
collection of $\exp(O(\epsilon^2 d))$ vectors picked at random from the
surface of $B_d$ will with high probability satisfy the angle
constraint.
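
Both effects are easy to observe numerically. The sketch below is only an illustration with our own function names: it evaluates the fraction $r^d$ directly, and it draws random points on the unit sphere by normalizing vectors of independent Gaussian coordinates (a standard way to sample uniformly from the sphere, assumed here) before measuring how far the pairwise angles stray from 90 degrees.

```python
import math
import random


def inner_fraction(r, d):
    """Fraction of the unit ball in d dimensions lying within distance r of the origin."""
    return r ** d


def random_unit_vector(d, rng):
    """Uniform random point on the unit sphere in d dimensions."""
    v = [rng.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def max_angle_deviation(num_vectors, d, rng):
    """Largest |angle - 90 degrees| over all pairs of random unit vectors."""
    vectors = [random_unit_vector(d, rng) for _ in range(num_vectors)]
    worst = 0.0
    for i in range(num_vectors):
        for j in range(i + 1, num_vectors):
            cosine = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            cosine = max(-1.0, min(1.0, cosine))  # guard against rounding
            angle = math.degrees(math.acos(cosine))
            worst = max(worst, abs(angle - 90.0))
    return worst
```

In two dimensions a quarter of the disk lies within distance 1/2 of the origin (`inner_fraction(0.5, 2)` is 0.25), while in 100 dimensions the corresponding fraction is below $10^{-30}$. Likewise, twenty random directions in three dimensions are far from mutually orthogonal, whereas twenty random directions in 2,000 dimensions are typically all within a few degrees of a right angle.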

These examples hint at the strange- 
ness of high-dimensional space. How- 
ever, such effects do not directly help 
with data analysis because they pertain 
to very specialized sets of points—those 
chosen randomly from the unit ball— 
whereas real data sets might look differ- 
ent. The trick is to take an arbitrary data 
set and then add randomness to it in 
such a way that the outcome is helpful 
and predictable. 

An early groundbreaking result of this kind was Dvoretzky’s theorem.’
Let K be a convex body in $\mathbb{R}^d$: for instance, the feasible
region of a linear program with d variables. K could be enormously
complicated. But a random slice through K (of appropriate dimension)
will with high probability be almost ellipsoidal. More precisely, it
will contain an ellipsoid E and be contained within $(1+o(1))E$.

A more recent result is the Johnson- 
Lindenstrauss theorem.’ Take any n 
points in Euclidean space of arbitrarily 
high dimension. If they are projected 
into a random subspace of O(log n) di- 
mensions, distances between points 
will with high probability be almost perfectly
preserved. Since clustering and
other forms of data analysis frequently
depend only on interpoint distances,
the dimension of data sets can automatically
be reduced to $O(\log n)$.

The following paper by Nir Ailon and 
Bernard Chazelle entitled “Faster Di- 
mension Reduction” demonstrates an 
ingenious variant of this theorem that 
permits the projection to be achieved 
especially fast. 
Faster Dimension Reduction


By Nir Ailon and Bernard Chazelle 


Abstract 

Data represented geometrically in high-dimensional vector
spaces can be found in many applications. Images and
videos are often represented by assigning a dimension
for every pixel (and time). Text documents may be represented
in a vector space where each word in the dictionary
incurs a dimension. The need to manipulate such data
in huge corpora such as the web and to support various
query types gives rise to the question of how to represent
the data in a lower-dimensional space to allow more space-
and time-efficient computation. Linear mappings are an
attractive approach to this problem because the mapped
input can be readily fed into popular algorithms that operate
on linear spaces (such as principal-component analysis,
PCA) while avoiding the curse of dimensionality.

The fact that such mappings even exist became known 
in computer science following seminal work by Johnson 
and Lindenstrauss in the early 1980s. The underlying 
technique is often called “random projection.” The com- 
plexity of the mapping itself, essentially the product of a 
vector with a dense matrix, did not attract much attention 
until recently. In 2006, we discovered a way to “sparsify” 
the matrix via a computational version of Heisenberg’s 
Uncertainty Principle. This led to a significant speedup, 
which also retained the practical simplicity of the stan- 
dard Johnson-Lindenstrauss projection. We describe 
the improvement in this article, together with some of its 
applications. 


1. INTRODUCTION 


Dimension reduction, as the name suggests, is an algorithmic
technique for reducing the dimensionality of data.
From a programmer’s point of view, a d-dimensional array
of real numbers, after applying this technique, is represented
by a much smaller array. This is useful because
many data-centric applications suffer from exponential
blowup as the underlying dimension grows. The infamous
curse of dimensionality (exponential dependence of an
algorithm on the dimension of the input) can be avoided
if the input data is mapped into a space of logarithmic
dimension (or less); for example, an algorithm running
in time proportional to $2^d$ in dimension d will run in linear
time if the dimension can be brought down to $\log d$.
Common beneficiaries of this approach are clustering
and nearest neighbor searching algorithms. One
typical case involving both is, for example, organizing a
massive corpus of documents in a database that allows
one to respond quickly to similar-document searches.
The clustering is used in the back-end to eliminate (near)
duplicates, while nearest-neighbor queries are processed
at the front-end. Reducing the dimensionality of the
data helps the system respond faster to both queries and
data updates. The idea, of course, is to retain the basic
metric properties of the data set (e.g., pairwise distances)
while reducing its size. Because this is technically
impossible to do, one will typically relax this demand and
tolerate errors as long as they can be made arbitrarily
small.

The common approaches to dimensionality reduction 
fall into two main classes. The first one includes data- 
aware techniques that take advantage of prior information 
about the input, principal-component analysis (PCA) and 
compressed sensing being the two archetypical examples: 
the former works best when most of the information in 
the data is concentrated along a few fixed, unknown direc- 
tions in the vector space. The latter shines when there 
exists a basis of the linear space over which the input can 
be represented sparsely, i.e., as points with few nonzero 
coordinates. 

The second approach to dimension reduction includes 
data-oblivious techniques that assume no prior information 
on the data. Examples include sketches for data streams,
locality sensitive hashing, and random linear mappings
in Euclidean space. The last of these is the focus of this article.
Programmatically, it is equivalent to multiplying the input
array by a random matrix. We begin with a rough sketch of
the main idea.

Drawing on basic intuition from both linear alge- 
bra and probability, it may be easy to see that mapping 
high-dimensional data into a random lower-dimensional 
space via a linear function will produce an approximate 
representation of the original data. Think of the direc- 
tions contained in the random space as samples from 
a population, each offering a slightly different view of 
a set of vectors, given by their projection therein. The 
collection of these narrow observations can be used to 
learn about the approximate geometry of these vectors. 
By “approximate” we mean that properties that the task 
at hand may care about (such as distances and angles 
between vectors) will be slightly distorted. Here is a small
example for concreteness. Let $a_1, \ldots, a_d$ be independent
random variables with mean 0 and unit variance
(e.g., a Gaussian $N(0, 1)$). Given a vector $x = (x_1, \ldots, x_d)$,
consider the inner product $Z = \sum_i a_i x_i$; the expectation of Z is
0 but its variance is precisely the square of the Euclidean
length of x. The number Z can be interpreted as a “random
projection” in one dimension: the variance allows us to
“read off” the length of x. By sampling in this way several
times, we can increase our confidence, using the law of
large numbers. Each sample corresponds to a dimension.
The beauty of the scheme is that we can now use it to handle
many distances at once.

A previous version of this paper appeared in Proceedings
of the 38th ACM Symposium on Theory of Computing (May
2006, Seattle, WA).
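
The one-dimensional "random projection" described above can be tried numerically. The following is a minimal sketch, assuming NumPy and Gaussian coefficients; the function name `estimate_squared_length` and the sample counts are illustrative choices, not part of the original article. It averages $Z^2$ over many independent draws of the coefficients and compares the result with the true squared length of x.

```python
import numpy as np

def estimate_squared_length(x, num_samples, rng):
    """Average Z^2 over independent samples Z = sum_i a_i x_i, a_i ~ N(0, 1).

    Each Z has mean 0 and variance ||x||^2, so the average of Z^2
    approaches ||x||^2 as num_samples grows (law of large numbers).
    """
    a = rng.standard_normal((num_samples, x.shape[0]))  # one row per sample
    z = a @ x                                           # one projection per row
    return float(np.mean(z ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(200)          # an arbitrary test vector (assumption)
true_sq = float(x @ x)
estimate = estimate_squared_length(x, num_samples=4000, rng=rng)
relative_error = abs(estimate - true_sq) / true_sq
print(f"true {true_sq:.2f}, estimated {estimate:.2f}, rel. error {relative_error:.3f}")
```

Each row of `a` plays the role of one output dimension, which is why the number of samples controls the accuracy of the estimate.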

It is easy to see that randomness is necessary if we hope 
to make meaningful use of the reduced data; otherwise 
we could be given as input a set of vectors belonging to 
the kernel of any fixed matrix, thus losing all information. 
The size of the distortion as well as the failure probability are
user-specified parameters that determine the target (low)
dimension. How many dimensions are sufficient? Careful
quantitative calculation reveals that, if all we care about
is distances between pairs of vectors and angles between
them (in other words, the Euclidean geometry of the data),
then a random linear mapping to a space of dimension logarithmic
in the size of the data is sufficient. This statement,
which we formalize in Section 1.1, follows from Johnson
and Lindenstrauss’s seminal work.”? The consequence
is quite powerful: If our database contains n vectors in
d dimensions, then we can replace it with one in which the data
has only log n dimensions! Although the original paper
was not stated in a computational language, deriving a naive
pseudocode for an algorithm implementing the idea in that
paper is almost immediate. This algorithm, which we refer
to as JL for brevity, has been studied in theoretical computer
science in many different contexts. The main theme
in this study is improving efficiency of algorithms for high-dimensional
geometric problems such as clustering,*’ nearest
neighbor searching,’ and large scale linear algebraic
computation. '* 28:3, 36 38

For many readers it may be obvious that these algorithms 
are directly related to widely used technologies such as web 
search. For others this may come as a surprise: Where does 
a Euclidean space hide in a web full of textual documents? 
It turns out that it is very useful to represent text as vectors
in high-dimensional Euclidean space.*' The dimension in 
the latter example can be as high as the number of words 
in the text language! 

This last example illustrates what a metric embedding is: 
a mapping of objects as points in metric spaces. Computer 
scientists care about such embeddings because often it is 
easier to design algorithms for metric spaces. The sim- 
pler the metric space is, the friendlier it is for algorithm 
design. Dimensionality is just one out of many measures of 
simplicity. We digress from JL by mentioning a few impor- 
tant results in computer science illustrating why embed- 
ding input in simple metric spaces is useful. We refer the 
reader to Linial et al.”” for one of the pioneering works in 
the field. 


* The Traveling Salesman Problem (TSP), in which
one wishes to plan a full tour of a set of cities with
given costs of traveling between any two cities, is
an archetype of computational hardness that
becomes easier if the cities are embedded in a metric
space,'‘ and especially in a low-dimensional Euclidean
one.’

* Problems such as combinatorial optimization on
graphs become easier if the nodes of the graph can be
embedded in $\ell_1$ space. (The space $\ell_p^d$ is defined to be the
set $\mathbb{R}^d$ endowed with a norm, where the norm of a vector
x (written as $\|x\|_p$) is given by $(\sum_{i=1}^d |x_i|^p)^{1/p}$. The metric is
given by the distance between pairs of vectors x and y
taken to be $\|x - y\|_p$.)

* Embedding into tree metrics (where the distance 
between two nodes is defined by the length of the path 
joining them) is useful for solving network design opti- 
mization problems. 


The JL algorithm linearly embeds an input which is
already in a high-dimensional Euclidean space $\ell_2^d$ into a
lower-dimensional $\ell_p^k$ space for any $p \ge 1$, and admits a naive
implementation with $O(dk)$ running time per data vector; in
other words, the complexity is proportional to the number
of random matrix elements.

Our modification of JL is denoted FJLT, for Fast
Johnson-Lindenstrauss Transform. Although JL often works
well, it is the computational bottleneck of many applications,
such as approximate nearest neighbor searching.*" ”’
In such cases, substituting FJLT yields an immediate 
improvement. Another benefit is that implementing FJLT 
remains extremely simple. Later in Section 3 we show how 
FJLT helps in some of the applications mentioned above. 
Until then, we concentrate on the story of FJLT itself, which 
is interesting in its own right. 


1.1. A brief history of a quest for a faster JL 
Before describing our result, we present the original JL
result in detail, and survey results related to its computational
aspects. We begin with the central lemma
behind JL.” The following are the main variables we will be
manipulating:

X: a set of vectors in Euclidean space (our input dataset).
In what follows, we use the terms points and vectors
interchangeably.

n: the size of the set X.

d: the dimension of the Euclidean space (typically
very big).

k: the dimension of the space we will reduce the points
in X to (ideally, much smaller than d).

$\varepsilon$: a small tolerance parameter, measuring the
maximum allowed distortion rate of the metric space
induced by the set X in Euclidean d-space (the exact definition
of distortion will be given below).

In JL, we take k to be $c\varepsilon^{-2} \log n$, for some large enough
absolute constant c. We then choose a random subspace of
dimension k in $\mathbb{R}^d$ (we omit the mathematical details of what
a random subspace is), and define $\Phi$ to be the operation of
projecting a point in $\mathbb{R}^d$ onto the subspace. We remind the
reader that such an operation is linear, and is hence equivalently
representable by a matrix. In other words, we’ve just
defined a random matrix. Denote it by $\Phi$.

Figure 1. Embedding a spherical metric onto a planar one is no easy
task. The latter is more favorable as input to printers.

The JL Lemma states that with high probability, for all
pairs of points $x, y \in X$ simultaneously,

$$\sqrt{k/d}\,\|x - y\|_2 (1 - \varepsilon) \le \|\Phi x - \Phi y\|_2 \le \sqrt{k/d}\,\|x - y\|_2 (1 + \varepsilon). \tag{1}$$

This fact is useful provided that k < d, which will be implied
by the assumption

$$n = 2^{O(\varepsilon^2 d)}. \tag{2}$$

Informally, JL says that projecting the n points on a random
low-dimensional subspace should, up to a distortion of
$1 \pm \varepsilon$, preserve pairwise distances. The mapping matrix $\Phi$
of size $k \times d$ can be implemented in a computer program as
follows: The first row is a random unit vector chosen uniformly
in $\mathbb{R}^d$; the second row is a random unit vector from
the space orthogonal to the first row; the third is a random
unit vector from the space orthogonal to the first two rows,
etc. The high-level proof idea is to show that for each pair
$x, y \in X$ the probability of (1) being violated is of order $1/n^2$.
A standard union bound over the number of pairs of points
in X then concludes the proof.
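
The row-by-row construction above amounts to orthonormalizing random directions. A minimal sketch follows, assuming NumPy; it swaps in a QR factorization of a Gaussian matrix as the orthonormalization step (an implementation convenience, not the authors' code), and the names `random_projection_matrix` and `max_pairwise_distortion` as well as the values of n, d, and k are illustrative. It measures how far the projected distances deviate from the $\sqrt{k/d}$ scaling in (1).

```python
import numpy as np

def random_projection_matrix(k, d, rng):
    """Return a k x d matrix whose rows are orthonormal and span a random subspace.

    QR of a d x k Gaussian matrix performs the successive orthogonalization
    described in the text (assumption: used here in place of an explicit loop).
    """
    g = rng.standard_normal((d, k))
    q, _ = np.linalg.qr(g)      # q has shape (d, k) with orthonormal columns
    return q.T

def max_pairwise_distortion(points, phi):
    """Largest relative deviation of projected distances from sqrt(k/d) * original."""
    k, d = phi.shape
    scale = np.sqrt(k / d)
    worst = 0.0
    for i in range(points.shape[0]):
        for j in range(i + 1, points.shape[0]):
            diff = points[i] - points[j]
            ratio = np.linalg.norm(phi @ diff) / (scale * np.linalg.norm(diff))
            worst = max(worst, abs(ratio - 1.0))
    return worst

rng = np.random.default_rng(1)
n, d, k = 30, 2000, 400                    # illustrative sizes (assumption)
points = rng.standard_normal((n, d))
phi = random_projection_matrix(k, d, rng)
distortion = max_pairwise_distortion(points, phi)
print(f"worst relative distortion over all pairs: {distortion:.3f}")
```

Applying `phi` to one vector costs on the order of dk operations, which is the cost the rest of the article sets out to reduce.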

It is interesting to pause and ask whether the JL theorem 
should be intuitive. The answer is both yes and no. Low- 
dimensional geometric intuition is of little help. Take an 
equilateral triangle ABC in the plane (Figure 2): no matter
how you project it into a line, you get three points in a row,
two of which form a distance at least twice the smallest one.
The distortion is at least 2, which is quite bad. The problem
is that, although the expected length of each side’s projection
is identical, the variance is high. In other words, the
projected distance is rarely close to the average. If, instead
of d = 2, we choose a high dimension d and project down to
$k = c\varepsilon^{-2} \log n$ dimensions, the three projected lengths of ABC
still have the same expected value, but crucially their (identical)
variances are now very small. Why? Each such length
(squared) is a sum of k independent random variables, so its
distribution is almost normal with variance proportional to
k (this is a simple case of the central limit theorem). This fact
alone explains each factor in the expression for k: $\varepsilon^{-2}$ ensures
the desired distortion; $\log n$ reduces the error probability to
$n^{-c'}$, for constant $c'$ growing with c, which allows us to apply a
union bound over all $\binom{n}{2}$ pairs of distances in X.

Figure 2. A triangle cannot be embedded onto a line while
simultaneously preserving distances between all pairs of vertices.
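
The union-bound step can be spelled out in one line. This is a worked version of the argument sketched above, under the stated assumption that each individual pair violates (1) with probability at most $n^{-c'}$:

```latex
\Pr[\text{some pair } x, y \in X \text{ violates (1)}]
  \;\le\; \binom{n}{2}\, n^{-c'}
  \;<\; \tfrac{1}{2}\, n^{2 - c'} .
```

The right-hand side is below 1/2 as soon as $c' \ge 2$, and it tends to zero with growing n when $c' > 2$, which is why the constant c in k must be large enough.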

Following Johnson and Lindenstrauss,” various researchers
suggested simplifications of the original JL design and of
their proofs (Frankl and Maehara,” DasGupta and Gupta,"”
Indyk and Motwani”’). These simplifications slightly change
the distribution from which $\Phi$ is drawn and result in a better
constant c and simpler proofs. These results, however, do
not depart from the original JL from a computational point
of view, because the necessary time to apply $\Phi$ to a vector is
still of order dk.

A bold and ingenious attempt to reduce this cost was
taken by Achlioptas.' He noticed that the only property of
$\Phi$ needed for the transformation to work is that $(\Phi_i \cdot x)^2$ be
tightly concentrated around the mean 1/d for all unit vectors
$x \in \mathbb{R}^d$, where $\Phi_i$ is the ith row of $\Phi$. The distribution he proposed
is very simple: Choose each element of $\Phi$ independently
from the following distribution:

$$\Phi_{ij} = \begin{cases} +\sqrt{3/d} & \text{with probability } 1/6; \\ 0 & \text{with probability } 2/3; \\ -\sqrt{3/d} & \text{with probability } 1/6. \end{cases}$$


The nice property of this distribution is that it is relatively
sparse: on average, a fraction 2/3 of the entries of $\Phi$ are 0.
Assuming we want to apply $\Phi$ on many points in $\mathbb{R}^d$ in a real-time
setting, we can keep a linked list of all the nonzeros of
$\Phi$ during preprocessing and reap the rewards in the form of
a threefold speedup in running time.
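
A minimal sketch of this three-valued distribution, assuming NumPy, is given below. The function name `achlioptas_matrix` and the sizes are illustrative; the matrix is stored densely for simplicity, whereas the speedup described above comes from keeping only the nonzeros. The check at the end estimates the mean of $(\Phi_i \cdot x)^2$ over the rows for a unit vector x, which should be close to 1/d.

```python
import numpy as np

def achlioptas_matrix(k, d, rng):
    """Sample a k x d matrix with i.i.d. entries in {+sqrt(3/d), 0, -sqrt(3/d)}.

    The probabilities are 1/6, 2/3, 1/6, so each entry has mean 0 and variance 1/d.
    """
    values = np.array([1.0, 0.0, -1.0]) * np.sqrt(3.0 / d)
    return rng.choice(values, size=(k, d), p=[1 / 6, 2 / 3, 1 / 6])

rng = np.random.default_rng(2)
k, d = 200, 3000                              # illustrative sizes (assumption)
phi = achlioptas_matrix(k, d, rng)
zero_fraction = float(np.mean(phi == 0.0))    # expected to be near 2/3

x = rng.standard_normal(d)
x /= np.linalg.norm(x)                        # unit vector
mean_sq = float(np.mean((phi @ x) ** 2))      # expected to be near 1/d
print(f"zero fraction {zero_fraction:.3f}, d * mean (Phi_i . x)^2 = {d * mean_sq:.3f}")
```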

Is Achlioptas’s result optimal, or is it possible to get a
super-constant speedup? This question is the point of departure
for this work. One idea to obtain a speedup, aside from
sparsifying $\Phi$, would be to reduce the target dimension k, and
multiply by a smaller matrix $\Phi$. Does this have a chance of
working? A lower bound of Alon’ provides a negative answer
to this question, and dashes any hope of reducing the number
of rows of $\Phi$ by more than a factor of $O(\log(1/\varepsilon))$. The
remaining question is hence whether the matrix can be
made sparser than Achlioptas’s construction. This idea has
been explored by Bingham and Mannila."' They considered
sparse projection heuristics, namely, fixing most of the
entries of $\Phi$ as zeroes. They noticed that in practice such
matrices $\Phi$ seem to give a considerable speedup with little
compromise in distortion for data found in certain applications.
Unfortunately, it can be shown that sparsifying $\Phi$
by more than a constant factor (as implicitly suggested in
Bingham and Mannila’s work) will not work for all inputs.
Indeed, a sparse matrix will typically distort a sparse vector.
The intuition for this is given by an extreme case: If both $\Phi$
and the vector x are very sparse, the product $\Phi x$ may be null,
not necessarily because of cancellations, but more simply
because each multiplication $\Phi_{ij} x_j$ is itself zero.


1.2. The random densification technique 
In order to prevent the problem of simultaneous sparsity
of $\Phi$ and x, we use a central concept from harmonic analysis
known as the Heisenberg principle, so named because
it is the key idea behind the Uncertainty Principle: a signal
and its spectrum cannot be both concentrated. The look of
frustration on the face of any musician who has to wrestle
with the delay from a digital synthesizer can be attributed to
the Uncertainty Principle.


Before we show how to use this principle, we must stop
and ask: what are the tools we have at our disposal? We may
write the matrix $\Phi$ as a product of matrices, or, algorithmically,
apply a chain of linear mappings on an input vector.
With that in mind, an interesting family of matrices we can
apply to an input vector is the orthogonal family of d-by-d
matrices. Such matrices are isometries: The Euclidean geometry
suffers no distortion from their application.

With this in mind, we precondition the random k-by-d map- 
ping with a Fourier transform (via an efficient FFT algorithm) 
in order to isometrically densify any sparse vector. To prevent 
the inverse effect, i.e., the sparsification of dense vectors, 
we add a little randomization to the Fourier transform (see 
Section 2 for details). This works because sparse
vectors are rare within the space of all vectors. Think of them
as forming a tiny ball within a huge one: if you are inside the
tiny ball, a random transformation is likely to take you outside;
on the other hand, if you are outside to begin with, the transformation
is highly unlikely to take you inside the tiny ball.

The resulting FJLT shares the low-distortion characteristics
of JL but with a lower running time complexity.


2. THE DETAILS OF FJLT 
In this section we show how to construct a matrix $\Phi$ drawn
from FJLT and then prove that it works, namely:


1. It provides a low distortion guarantee. (In addition to
showing that it embeds vectors in low-dimensional $\ell_2^k$,
we will show it also embeds in $\ell_1^k$.)

2. Applying it to a vector is efficiently computable. 


The first property is shared by the standard JL and its vari- 
ants, while the second one is the main novelty of this work. 


2.1. Constructing $\Phi$


We first make some simplifying assumptions. We may
assume with no loss of generality that d is a power of two,
$d = 2^m > k$, and that $n \ge d = \Omega(\varepsilon^{-1/2})$; otherwise the dimension
of the reduced space is linear in the original dimension. Our
random embedding $\Phi \sim \mathrm{FJLT}(n, d, \varepsilon, p)$ is a product of three
real-valued matrices (Figure 3):

$$\Phi = PHD.$$

Figure 3. FJLT.
The matrices P and D are chosen randomly whereas H is
deterministic:

* P is a k-by-d matrix. Each element is an independent
mixture of 0 with an unbiased normal distribution of
variance 1/q, where

$$q = \min\left\{\Theta\left(\frac{\varepsilon^{p-2} \log^p n}{d}\right), 1\right\}.$$

In other words, $P_{ij} \sim N(0, 1/q)$ with probability q, and
$P_{ij} = 0$ with probability 1 - q.

* H is a d-by-d normalized Walsh-Hadamard matrix:

$$H_{ij} = d^{-1/2} (-1)^{\langle i-1,\, j-1 \rangle},$$

where $\langle i, j \rangle$ is the dot-product (modulo 2) of the m-bit
vectors i, j expressed in binary.

* D is a d-by-d diagonal matrix, where each $D_{ii}$ is drawn
independently from {-1, 1} with probability 1/2.


The Walsh-Hadamard matrix corresponds to the discrete
Fourier transform over the additive group $GF(2)^m$: its FFT is
very simple to compute and requires only $O(d \log d)$ steps.
It follows that the mapping $\Phi x$ of any point $x \in \mathbb{R}^d$ can be
computed in time $O(d \log d + |P|)$, where |P| is the number
of nonzero entries in P. The latter is $O(qd\varepsilon^{-2} \log n)$ not only on
average but also with high probability. Thus we can assume
that the running time of $O(d \log d + qd\varepsilon^{-2} \log n)$ is worst-case,
and not just expected.
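
The three-factor construction can be sketched in a few lines. The following is a minimal illustration, assuming NumPy; it is not the authors' implementation. The names `fwht`, `sample_fjlt`, and `apply_fjlt` are hypothetical, the constant hidden in the $\Theta(\cdot)$ for q is set to 1 through the parameter `c`, natural logarithms are used, and P is stored densely for readability (a practical version would keep only its nonzeros to obtain the O(|P|) cost).

```python
import numpy as np

def fwht(v):
    """Normalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    v = np.array(v, dtype=float)              # work on a copy
    d = v.shape[0]
    assert d > 0 and (d & (d - 1)) == 0, "length must be a power of two"
    h = 1
    while h < d:                               # log2(d) butterfly passes, O(d) each
        for start in range(0, d, 2 * h):
            a = v[start:start + h].copy()
            b = v[start + h:start + 2 * h].copy()
            v[start:start + h] = a + b
            v[start + h:start + 2 * h] = a - b
        h *= 2
    return v / np.sqrt(d)

def sample_fjlt(n, d, k, eps, p, rng, c=1.0):
    """Draw the random parts (P, D) of Phi = P H D; c is an assumed constant."""
    q = min(c * eps ** (p - 2) * np.log(n) ** p / d, 1.0)
    D = rng.choice([-1.0, 1.0], size=d)                       # diagonal of D
    mask = rng.random((k, d)) < q                             # nonzero pattern of P
    P = np.where(mask, rng.normal(0.0, np.sqrt(1.0 / q), size=(k, d)), 0.0)
    return P, D, q

def apply_fjlt(P, D, x):
    """Compute Phi x = P (H (D x))."""
    return P @ fwht(D * x)

rng = np.random.default_rng(3)
n, d, k, eps = 100, 1024, 256, 0.5            # illustrative parameters (assumption)
P, D, q = sample_fjlt(n, d, k, eps, p=2, rng=rng)
x = np.zeros(d)
x[0] = 1.0                                     # the sparsest possible unit vector
y = apply_fjlt(P, D, x)
ratio = float(y @ y) / k                       # squared norm divided by k
print(f"q = {q:.4f}, nonzeros in P = {np.count_nonzero(P)}, ||Phi x||^2 / k = {ratio:.3f}")
```

In this sketch each coordinate of $\Phi x$ has unit variance when x is a unit vector, so the squared Euclidean norm of the output has expectation k; the printed ratio shows how close a single draw comes even for a maximally sparse input.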

The FJLT Lemma. Given a fixed set X of n points in $\mathbb{R}^d$, $\varepsilon < 1$, and
$p \in \{1, 2\}$, draw a matrix $\Phi$ from FJLT. With probability at least
2/3, the following two events occur:

1. For any $x \in X$,

$$(1 - \varepsilon)\alpha_p \|x\|_2 \le \|\Phi x\|_p \le (1 + \varepsilon)\alpha_p \|x\|_2,$$

where $\alpha_1 = k\sqrt{2\pi^{-1}}$ and $\alpha_2 = k$.

2. The mapping $\Phi: \mathbb{R}^d \to \mathbb{R}^k$ requires

$$O(d \log d + \min\{d\varepsilon^{-2} \log n,\ \varepsilon^{p-4} \log^{p+1} n\})$$

operations.

Remark: By repeating the construction $O(\log(1/\delta))$ times we
can ensure that the probability of failure drops to $\delta$ for any
desired $\delta > 0$. By failure we mean that either the first or the
second part of the lemma does not hold.


2.2. Showing that $\Phi$ works

We sketch a proof of the FJLT Lemma. Without loss of generality,
we can assume that $\varepsilon < \varepsilon_0$ for some suitably small $\varepsilon_0$.
Fix $x \in X$. The inequalities of the lemma are invariant under
scaling, so we can assume that $\|x\|_2 = 1$. Consider the random
variable $u = HDx$, denoted by $(u_1, \ldots, u_d)^T$. The first coordinate
$u_1$ is of the form $\sum_{i=1}^d a_i x_i$, where each $a_i = \pm d^{-1/2}$ is chosen
independently and uniformly. We can use a standard tail estimate
technique to prove that, with probability at least, say,
0.95,

$$\max_{x \in X} \|HDx\|_\infty = O(d^{-1/2} \sqrt{\log n}). \tag{3}$$

It is important to intuitively understand what (3)
means. Bounding $\|HDx\|_\infty$ is tantamount to bounding
the magnitude of all coordinates of HDx. This can be
directly translated to a densification property. To see
why, consider an extreme case: If we knew that, say,
$\|HDx\|_\infty < 1$, then we would automatically steer clear of
the sparsest case possible, in which x is null in all but one
coordinate (which would have to be 1 by the assumption
$\|x\|_2 = \|HDx\|_2 = 1$).
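
The densification effect is easiest to see on the sparsest input. The sketch below, assuming NumPy, builds the normalized Walsh-Hadamard matrix explicitly through repeated Kronecker products (a simple but slow construction chosen for clarity; the helper name `hadamard` and the dimension d = 256 are illustrative) and applies HD to a standard basis vector.

```python
import numpy as np

def hadamard(d):
    """Dense normalized d x d Walsh-Hadamard matrix, d a power of two."""
    h = np.array([[1.0]])
    base = np.array([[1.0, 1.0], [1.0, -1.0]])
    while h.shape[0] < d:
        h = np.kron(h, base)        # Sylvester doubling step
    return h / np.sqrt(d)

rng = np.random.default_rng(4)
d = 256
H = hadamard(d)
D = rng.choice([-1.0, 1.0], size=d)   # random signs on the diagonal

x = np.zeros(d)
x[7] = 1.0                            # a single nonzero coordinate
u = H @ (D * x)

sup_norm = float(np.max(np.abs(u)))   # every coordinate has magnitude d^(-1/2)
l2_norm = float(np.linalg.norm(u))    # unchanged, since H and D are isometries
print(f"||x||_inf = 1.0, ||HDx||_inf = {sup_norm:.4f}, ||HDx||_2 = {l2_norm:.4f}")
```

The mass that was concentrated in one coordinate of x is spread evenly over all d coordinates of HDx, while the Euclidean length is preserved.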

To prove (3), we first make the following technical
observation:

$$E[e^{t u_1}] = \prod_{i=1}^d E[e^{t a_i x_i}] = \prod_{i=1}^d \cosh(t d^{-1/2} x_i) \le e^{t^2 \|x\|_2^2/(2d)}.$$

Setting t = sd above, we now use the technical observation
together with Markov’s inequality to conclude that, for
any s > 0,

$$\Pr[|u_1| \ge s] = 2\Pr[e^{sdu_1} \ge e^{s^2 d}] \le 2E[e^{sdu_1}]/e^{s^2 d} \le 2e^{s^2 d\|x\|_2^2/2 - s^2 d} = 2e^{-s^2 d/2} \le 1/(20nd),$$

for $s = \Theta(d^{-1/2}\sqrt{\log n})$. A union bound over all $nd \le n^2$ coordinates
of the vectors $\{HDx \mid x \in X\}$ leads to (3). We assume
from now on that (3) holds with s as the upper bound; in
other words, $\|u\|_\infty \le s$, where u = HDx. Assume now that u is
fixed. It is convenient (and immaterial) to choose s so that
$m = s^{-2}$ is an integer (this m is not to be confused with the
number of bits used to index the entries of H).

It can be shown that $\|u\|_2 = \|x\|_2$, by virtue of both H and
D (and their composition) being isometries (i.e., preserve $\ell_2$
norms). Now define

$$y = (y_1, \ldots, y_k)^T = Pu = \Phi x.$$


The vector y is the final mapping of x using $\Phi$. It is useful
to consider each coordinate of y separately. All coordinates
share the same distribution (though not as independent
random variables). Consider $y_1$. By definition of FJLT, it is
obtained as follows: Pick random i.i.d. indicator variables
$b_1, \ldots, b_d$, where each $b_j$ equals 1 with probability q; then draw
random i.i.d. variables $r_1, \ldots, r_d$ from N(0, 1/q). Set $y_1 = \sum_{j=1}^d r_j b_j u_j$
and let $Z = \sum_{j=1}^d b_j u_j^2$. It can be shown that the conditional variable
$(y_1 \mid Z = z)$ is distributed N(0, z/q) (this follows from a well
known fact known as the 2-stability of the normal distribution).
Note that all of $y_1, \ldots, y_k$ are i.i.d. (given u), and we
can similarly define corresponding random i.i.d. variables
$Z_1 (= Z), Z_2, \ldots, Z_k$. It now follows that the expectation of Z
satisfies:

$$E[Z] = \sum_{j=1}^d u_j^2 E[b_j] = q. \tag{4}$$

Let $u^2$ formally denote $(u_1^2, \ldots, u_d^2) \in (\mathbb{R}^+)^d$. By our assumption
that (3) holds, $u^2$ lies in the d-dimensional polytope:

$$\mathcal{P} = \Big\{(a_1, \ldots, a_d) : 0 \le a_j \le \frac{1}{m} \text{ for all } j, \text{ and } \sum_{j=1}^d a_j = 1\Big\}.$$

Let $u^* \in \mathbb{R}^d$ denote a vector such that $u^{*2}$ is a vertex of $\mathcal{P}$. By
symmetry of these vertices, there will be no loss of generality
in what follows if we fix:

$$u^* = (\underbrace{m^{-1/2}, \ldots, m^{-1/2}}_{m}, \underbrace{0, \ldots, 0}_{d-m}).$$

The vector $u^*$ will be convenient for identifying extremal
cases in the analysis of Z. By extremal we mean the most
problematic case, namely, the sparsest possible under
assumption (3) (recall that the whole objective of HD was to
alleviate sparseness).

We shall use $Z^*$ to denote the random variable Z corresponding
to the case $u = u^*$. We observe that $Z^* \sim m^{-1} B(m, q)$;
in words, the binomial distribution with parameters m, q
divided by the constant m. Consequently,

$$\mathrm{var}(Z^*) = q(1 - q)/m. \tag{5}$$

In what follows, we divide our discussion between the $\ell_1$
and the $\ell_2$ cases.

The $\ell_1$ case: We choose

$$q = \min\{1/(\varepsilon m), 1\} = \min\left\{\Theta\left(\frac{\varepsilon^{-1} \log n}{d}\right), 1\right\}.$$

We now bound the moments of Z over the random $b_j$’s.

LEMMA 1. For any t > 1, $E[Z^t] = O(qt)^t$, and

$$(1 - \varepsilon)\sqrt{q} \le E[\sqrt{Z}] \le \sqrt{q}.$$


Proof: The case q = 1 is trivial because Z is constant and equal
to 1. So we assume $q = 1/(\varepsilon m) < 1$. It is easy to verify that $E[Z^t]$
is a convex function of $u^2$, and hence achieves its maximum
at a vertex of $\mathcal{P}$. So it suffices to prove the moment upper
bounds for $Z^*$, which conveniently behaves like a (scaled)
binomial. By standard bounds on the binomial moments,

$$E[Z^{*t}] = O(m^{-t} (mqt)^t) = O(qt)^t,$$

proving the first part of the lemma.
By Jensen’s inequality and (4),

$$E[\sqrt{Z}] \le \sqrt{E[Z]} = \sqrt{q}.$$

This proves the upper-bound side of the second part of the
lemma. To prove the lower-bound side, we notice that $E[\sqrt{Z}]$
is a concave function of $u^2$, and hence achieves its minimum
when $u = u^*$. So it suffices to prove the desired lower bound
for $E[\sqrt{Z^*}]$. Since $\sqrt{x} \ge 1 + \frac{1}{2}(x - 1) - (x - 1)^2$ for all $x \ge 0$,

$$E[\sqrt{Z^*}] = \sqrt{q}\, E[\sqrt{Z^*/q}] \ge \sqrt{q}\,\big\{1 + \tfrac{1}{2} E[Z^*/q - 1] - E[(Z^*/q - 1)^2]\big\}. \tag{6}$$

By (4), $E[Z^*/q - 1] = 0$ and, using (5),

$$E[(Z^*/q - 1)^2] = \mathrm{var}(Z^*/q) = (1 - q)/(qm) \le 1/(qm) = \varepsilon.$$

Plugging this into (6) shows that $E[\sqrt{Z^*}] \ge \sqrt{q}(1 - \varepsilon)$, as
desired. ∎


Since the expectation of the absolute value of N(0, 1)
is $\sqrt{2\pi^{-1}}$, by taking conditional expectations, we find
that

$$E[|y_1|] = \sqrt{2/(q\pi)}\; E[\sqrt{Z}].$$

On the other hand, by Lemma 1, we note that

$$(1 - \varepsilon)\sqrt{2/\pi} \le E[|y_1|] \le \sqrt{2/\pi}. \tag{7}$$

Next, we prove that $\|y\|_1$ is sharply concentrated around its
mean $E[\|y\|_1] = kE[|y_1|]$. To do this, we begin by bounding the
moments of $|y_1| = |\sum_j r_j b_j u_j|$. Using conditional expectations,
we can show that, for any integer $t \ge 0$,

$$E[|y_1|^t] = E[(q^{-1}Z)^{t/2}]\; E[|U|^t],$$

where U ~ N(0,1). It is well known that $E[|U|^t] = O(t)^{t/2}$; and so,
by Lemma 1,

$$E[|y_1|^t] = O(t)^t.$$


It follows that the moment generating function satisfies

$$E[e^{\lambda|y_1|}] = 1 + \lambda E[|y_1|] + \sum_{t>1} E[|y_1|^t]\lambda^t/t! \le 1 + \lambda E[|y_1|] + \sum_{t>1} O(t)^t \lambda^t/t!.$$

Therefore, it converges for any $0 \le \lambda < \lambda_0$, where $\lambda_0$ is an
absolute constant, and

$$E[e^{\lambda|y_1|}] = 1 + \lambda E[|y_1|] + O(\lambda^2) = e^{\lambda E[|y_1|] + O(\lambda^2)}.$$

Using independence, we find that

$$E[e^{\lambda\|y\|_1}] = (E[e^{\lambda|y_1|}])^k = e^{\lambda E[\|y\|_1] + O(\lambda^2 k)}.$$

Meanwhile, Markov’s inequality and (7) imply that

$$\Pr[\|y\|_1 \ge (1 + \varepsilon)E[\|y\|_1]] \le E[e^{\lambda\|y\|_1}]/e^{\lambda(1+\varepsilon)E[\|y\|_1]} \le e^{-\lambda\varepsilon E[\|y\|_1] + O(\lambda^2 k)} \le e^{-\Omega(\varepsilon^2 k)},$$

for some $\lambda = \Theta(\varepsilon)$. The constraint $\lambda < \lambda_0$ corresponds to $\varepsilon$
being smaller than some absolute constant. The same argument
leads to a similar lower tail estimate. Our choice of
k ensures that, for any $x \in X$, $\|\Phi x\|_1 = \|y\|_1$ deviates from its
mean by at most $\varepsilon$ with probability at least 0.95. By (7), this
implies that $kE[|y_1|]$ is itself concentrated around $\alpha_1 = k\sqrt{2/\pi}$
with a relative error at most $\varepsilon$; rescaling $\varepsilon$ by a constant factor
and ensuring (3) proves the $\ell_1$ claim of the first part of the
FJLT lemma.


The $\ell_2$ case: We set

$$q = \min\left\{\frac{c_1 \log^2 n}{d},\ 1\right\}$$

for a large enough constant $c_1$.

LEMMA 2. With probability at least $1 - \frac{1}{20n}$:

1. $q/2 \le Z_i \le 2q$ for all i = 1, ..., k; and
2. $kq(1 - \varepsilon) \le \sum_{i=1}^k Z_i \le kq(1 + \varepsilon)$.

Proof: If q = 1 then Z is the constant q and the claim is
trivial. Otherwise, $q = c_1 d^{-1} \log^2 n < 1$. For any real $\lambda$, the
function

$$f_\lambda(u_1^2, \ldots, u_d^2) = E[e^{\lambda Z}]$$

is convex, hence achieves its maximum at the vertices
of the polytope $\mathcal{P}$ (same as in the proof of Lemma 1).
As argued before, therefore, $E[e^{\lambda Z}] \le E[e^{\lambda Z^*}]$. We conclude
the proof of the first part with a union bound on standard
tail estimates on the scaled binomial $Z^*$ that we
derive from bounds on its moment generating function
$E[e^{\lambda Z^*}]$ (e.g., Alon and Spencer). For the second part, let
$S = \sum_{i=1}^k Z_i$. Again, the moment generating function of S is
bounded above by that of $S^* \sim m^{-1} B(mk, q)$ (all $Z_i$’s are
distributed as $Z^*$), and the desired concentration bound
follows. ∎


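We assume from now on that the premise of Lemma
2 holds for all choices of $x \in X$. A union bound shows
that this happens with probability of at least 0.95. For each
i = 1, ..., k the random variable $y_i^2 q/Z_i$ is distributed as $\chi^2$ with
one degree of freedom. It follows that, conditioned on $Z_i$,
the expected value of $y_i^2$ is $Z_i/q$ and the moment generating
function of $y_i^2$ is

$$E[e^{\lambda y_i^2}] = (1 - 2\lambda Z_i/q)^{-1/2}.$$

Given any $0 < \lambda < \lambda_0$, for fixed $\lambda_0$, for large enough $\xi$, the
moment generating function converges and satisfies

$$E[e^{\lambda y_i^2}] \le e^{\lambda Z_i/q + \xi\lambda^2 (Z_i/q)^2}.$$

We use here the fact that $Z_i/q = O(1)$, which we derive from
the first part of Lemma 2. By independence, therefore,

$$E\big[e^{\lambda\sum_{i=1}^k y_i^2}\big] \le e^{\lambda\sum_{i=1}^k Z_i/q + \xi\lambda^2\sum_{i=1}^k (Z_i/q)^2},$$

and hence

$$\Pr\Big[\sum_{i=1}^k y_i^2 > (1+\varepsilon)\sum_{i=1}^k Z_i/q\Big] = \Pr\Big[e^{\lambda\sum_{i=1}^k y_i^2} > e^{(1+\varepsilon)\lambda\sum_{i=1}^k Z_i/q}\Big] \le e^{-\lambda\varepsilon\sum_{i=1}^k Z_i/q + \xi\lambda^2\sum_{i=1}^k (Z_i/q)^2}. \tag{8}$$

If we plug

$$\lambda = \frac{\varepsilon\sum_{i=1}^k (Z_i/q)}{2\xi\sum_{i=1}^k (Z_i/q)^2}$$

into (8) and assume that $\varepsilon$ is smaller than some global $\varepsilon_0$, we
avoid convergence issues (Lemma 2). By that same lemma,
we now conclude that

$$\Pr\Big[\sum_{i=1}^k y_i^2 > (1+\varepsilon)^2 k\Big] \le e^{-\Omega(\varepsilon^2 k)}.$$

A similar technique can be used to bound the left tail estimate.
We set $k = c\varepsilon^{-2} \log n$ for some large enough c and use a
union bound, possibly rescaling $\varepsilon$, to conclude the $\ell_2$ case of
the first part of the FJLT lemma.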


Running Time: The vector Dx requires O(d) steps, since D is
diagonal. Computing H(Dx) takes $O(d \log d)$ time using the FFT
for Walsh-Hadamard. Finally, computing P(HDx) requires
O(|P|) time, where |P| is the number of nonzeros in P. This
number is distributed in B(dk, q), since P has dk entries. It is
now immediate to verify that

$$E[|P|] = O(\varepsilon^{p-4} \log^{p+1} n).$$

A Markov bound establishes the desired complexity of the
FJLT. This concludes our sketch of the proof of the FJLT
lemma. ∎


3. APPLICATIONS 


3.1. Approximate nearest neighbor searching 
Given a metric space $(U, d_U)$ and a finite subset (database)
$P \subseteq U$, the problem of $\varepsilon$-approximate nearest neighbor ($\varepsilon$-ANN)
searching is to preprocess P so that, given a query $x \in U$,
a point $p \in P$ satisfying

$$d_U(x, p) \le (1 + \varepsilon)\, d_U(x, q) \quad \text{for all } q \in P$$

can be found efficiently. In other words, we are interested in
a point p further from x by a factor at most $(1 + \varepsilon)$ of the distance
to its nearest neighbor.
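
The definition can be checked directly by brute force. The sketch below, assuming NumPy and the Euclidean metric for $d_U$, tests whether a candidate point is a valid $\varepsilon$-approximate nearest neighbor of a query; the function name `is_eps_ann` and the toy database are illustrative only, and the linear scan is exactly the cost that ANN data structures are designed to avoid.

```python
import numpy as np

def is_eps_ann(query, candidate, database, eps):
    """True if candidate is within (1 + eps) times the nearest-neighbor distance."""
    nearest = np.linalg.norm(database - query, axis=1).min()
    return bool(np.linalg.norm(candidate - query) <= (1.0 + eps) * nearest)

database = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
query = np.array([0.9, 0.0])          # exact nearest neighbor is (1, 0), at distance 0.1

print(is_eps_ann(query, database[1], database, eps=0.1))   # the exact nearest neighbor
print(is_eps_ann(query, database[0], database, eps=0.5))   # distance 0.9 vs. bound 0.15
print(is_eps_ann(query, database[0], database, eps=8.5))   # distance 0.9 vs. bound 0.95
```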

This problem has received considerable attention. There
are two good reasons for this: (i) ANN boasts more applications
than virtually any other geometric problem”; (ii) allowing
a small error $\varepsilon$ makes it possible to break the curse of
dimensionality.2*””

There is abundant literature on (approximate) nearest
neighbor searching.® 10, 12, 13, 15, 16, 19, 21-24, 26, 27, 33, 39, 40 The
early solutions typically suffered from the curse of dimensionality,
but the last decade has witnessed a flurry of new
algorithms that “break the curse” (see Indyk” for a recent
survey).

The first algorithms with query times of poly(d, log n)
and polynomial storage (for fixed ε) were those of Indyk
and Motwani in the Euclidean space case, and Kushilevitz
et al. in the Hamming cube case. Using JL, Indyk et al.
provide a query time of O(ε^{-2} d log n) with n^{O(ε^{-2})} storage
and preprocessing. A discrete variant of JL was used by
Kushilevitz et al. in the Hamming cube case. We mention
here that the dimension reduction overwhelms the running
time of the two algorithms. In order to improve the running
time in both cases, we used two main ideas in Ailon
and Chazelle. The first idea applied to the discrete case. It
used an observation related to the algebraic structure of the
discrete version of JL used in Kushilevitz et al. to obtain a
speedup in running time. This observation was only applicable
in the discrete case, but suggested the intuitive idea
that a faster JL should be possible in Euclidean space as
well, thereby motivating the search for FJLT. Indeed, by a
straightforward application in Indyk et al.’s algorithm (with
p = 1), the running time would later be improved using FJLT
to O(d log d + ε^{-3} log^2 n). Notice the additive form of this last
expression in some function f = f(d) and g = g(n, ε), instead of
a multiplicative one.


3.2. Fast approximation of large matrices 

Large matrices appear in virtually every corner of science.
Exact algorithms for decomposing or solving for large
matrices are often prohibitively expensive to perform. This
may change given improvements in matrix multiplication
technology, but it appears that we will have to rely on matrix
approximation strategies for a while, at least in the general
case. It turns out that FJLT and ideas inspired by it play an
important role in recent developments.

We elaborate on an example from a recent solution of
Sarlós to the problem of ℓ_2 regression (least square fit of an
overdetermined linear system). Prior to that work (and ours),
Drineas et al. showed that, by downsampling (choosing
only a small subset and discarding the rest) from the set of
equations of the linear regression, an approximate solution
to the problem could be obtained by solving the downsampled
problem, the size of which depends only on the dimension
d of the original solution space. The difficulty with this
method is that the downsampling distribution depends on
norms of rows of the left-singular vector matrix of the original
system. Computing this matrix is as hard as the original
regression problem and requires O(m^2 d) operations, with m
the number of equations. To make this solution more practical,
Sarlós observed that multiplying the equation matrix
on the left by the m × m orthogonal matrix HD (as defined
above in the definition of FJLT) implicitly multiplies the
left-singular vectors by HD as well. By an analysis similar to the
one above, the resulting left-singular matrix can be shown to
have almost uniform row norm. This allows use of Drineas
et al.’s ideas with uniform sampling of the equations. Put
together, these results imply the first o(m^2 d) running time
solution for worst-case approximate ℓ_2 regression.

In a recent stream of papers, authors Liberty, Martinsson,
Rokhlin, Tygert and Woolfe design and analyze fast
algorithms for low-dimensional approximation of
matrices, and demonstrate their application to the evaluation
of the SVD of numerically low-rank matrices. Their schemes
are based on randomized transformations akin to FJLT.


4. BEYOND FJLT 

The FJLT result gives rise to the following question: What is
a lower bound, as a function of n, d and ε, on the complexity
of computing a JL-like random linear mapping? By this we
mean a mapping that distorts pairwise Euclidean distances
among any set of n points in d dimensions by at most 1 + ε.
The underlying model of computation can be chosen as a
linear circuit, manipulating complex-valued intermediates
by either adding two or multiplying one by (random)
constants, and designating n as input and k = O(ε^{-2} log n) as
output (say, for p = 2). It is worth observing that any lower
bound in Ω(ε^{-2} log n min{d, log^2 n}) would imply a similar
lower bound on the complexity of computing a Fourier
transform. Such bounds are known only in a very restricted
model where constants are of bounded magnitude.

As a particular case of interest, we note that, whenever
k = O(d^{1/3}), the running time of FJLT is O(d log d). In a more
recent paper, Ailon and Liberty improved this bound and
showed that it is possible to obtain a JL-like random mapping
in time O(d log d) for k = O(d^{1/2-δ}) and any δ > 0. Their
transformation borrows the idea of preconditioning a Fourier
transform with a random diagonal matrix from FJLT, but
uses it differently and takes advantage of stronger measure
concentration bounds and tools from error correcting codes
over fields of characteristic 2. The same authors together
with Singer consider the following inverse problem: Design
randomized linear time computable transformations that
require the mildest assumptions possible on data to ensure
successful dimensionality reduction.


Want to be a Bug Buster?


By Shekhar Y. Borkar 


MICROPROCESSOR PERFORMANCE HAS increased
exponentially, made possible
by increasing transistor performance
and doubling the number of transistors
every two years to realize complex
architectures. These chips with ever-increasing
complexity are not always
fully functional on the first attempt;
they need to be debugged quickly,
bugs fixed, and reworked. We take this
for granted, but did you ever wonder
about how it is done?

Painfully, is the short answer! 


There are two types of bugs: functional
or logic bugs caused by design
errors, and electrical bugs due to circuit
marginalities, caused by unfavorable
operating conditions such as temperature
changes and voltage drops.
Although most of the functional bugs
are caught during rigorous design validation
and verification, it is virtually
impossible to ensure that a design is
bug-free before tape-out.

Circuit designers make great ef- 
forts to improve margins to avoid elec- 
trical bugs, but they too are difficult 
to avoid, and especially difficult to re- 
produce since they are manifested by 
various operating conditions. It is ex- 
tremely important to find these bugs 
quickly post-fabrication, and current 
techniques are far too expensive and 
time-consuming. The novel technique 
described in the following paper by 
Sung-Boem Park and Subhasish Mitra 
entitled “Post-Silicon Bug Localiza- 
tion for Processors Using IFRA” pro- 
vides the breakthrough. 

When an error is detected it could 
be caused by one or more such bugs, 
and you need to identify what caused 
the error; root causing the error is not 
that straightforward. The error may be 
caused by encountering a bug during 
an instruction execution, or it may be 
thousands, or even billions, of instruc- 
tion executions before. You must be a 
very good detective to diagnose and 
isolate the bug. 

Post-silicon bug localization and
isolation is time-consuming and costly
because you have to reproduce the
failures by returning the hardware
to an error-free state, activating the
failure-causing stimuli, and then reproducing
the same failures, which is the most
difficult task, considering complexities
such as asynchronous signals and
multiple clock domains.


If you are a detective and want to
catch bad guys, then one of the first
steps is to acquire the surveillance video
from the scene to get the clues. If the
surveillance camera is rolling around
the clock, you will likely be required to
watch the entire tape before coming
upon the culprit; very tedious indeed.
Wouldn’t it be nice if the camera
rolled only when sensing an action
taking place? Yes, but it would still require
watching every activity recorded
by the camera and would still be a
time-consuming and wasteful task.
Better yet, what if the camera recorded
only suspicious activity, capturing
all the necessary circumstantial
evidence, and not necessarily the culprits?
That would be an optimal solution
and would make the detective’s
job a lot easier. This scenario is exactly
what IFRA does.

IFRA implements circular buffers,
capable of recording traces of instructions
executed by the processor, and
this process is controlled by failure
detectors. These detectors use failure
detection in the hardware, such as parity
errors as well as soft-triggers that
suspect an early symptom of a failure.
These triggers stop further recording
in the circular buffer, capturing the
instruction trace of the suspected part
of the instruction sequence for future
analysis.

The traditional method of isolating 
a bug is by comparing the captured 
instruction trace with a golden trace 
captured by a trusted simulator. Sim- 
ulators are notoriously slow and the 
process is very time-consuming. The 
authors of this paper propose a novel 
concept of self-consistency to localize 
the bug by examining the instruction 
trace. For example, if an instruction 
uses a faulty operand then you do not 
need to know the exact value of the operand.
It is sufficient to know that the


instruction used a different value than 
the one that was produced for its use. 
The authors describe in detail how to 
diagnose and root cause the bug us- 
ing the captured instruction trace and 
self-consistency. 

Clearly, this is a very novel approach
to post-silicon debug, and I would
not be surprised to see it catch on
quickly; but it’s just a start. This paper
describes how to debug a microprocessor,
and this technique has great
potential to go further and help debug
multicores, memory systems, analog
circuits, and even complex SOCs.

Finally, as a critique, you may claim
that IFRA adds hardware to the chip
that is useful only for debugging. Not
really. Transistors are inexpensive,
so inexpensive that it is almost like
incorporating a small logic analyzer
or a tester on the chip itself to aid in
debugging and then turning it off; you
don’t even notice it is there.

Is there better use for a transistor 
than to help bring products to you 
quickly and inexpensively? 


Shekhar Y. Borkar is an Intel Fellow and Director of 
Microprocessor Research, Intel Corp., Hillsboro, OR.


Post-Silicon Bug Localization
for Processors Using IFRA 


By Sung-Boem Park and Subhasish Mitra 


Abstract 

IFRA, an acronym for Instruction Footprint Recording and
Analysis, overcomes major challenges associated with a very
expensive step in post-silicon validation of processors:
pinpointing a bug location and the instruction sequence
that exposes the bug from a system failure, such as a crash.
Special on-chip recorders, inserted in a processor during
design, collect instruction footprints (special information
about flows of instructions, and what the instructions did
as they passed through various microarchitectural blocks
of the processor). The recording is done concurrently during
the normal operation of the processor in a post-silicon
system validation setup. Upon detection of a system failure,
the recorded information is scanned out and analyzed off-line
for bug localization. Special self-consistency-based program
analysis techniques, together with the test-program
binary of the application executed during post-silicon validation,
are used for this purpose. Major benefits of using IFRA
over traditional techniques for post-silicon bug localization
are (1) it does not require full system-level reproduction of
bugs, and (2) it does not require full system-level simulation.
Hence, it can overcome major hurdles that limit the scalability
of traditional post-silicon validation methodologies.
Simulation results on a complex superscalar processor demonstrate
that IFRA is effective in accurately localizing electrical
bugs with 1% chip-level area impact.


1. INTRODUCTION
Post-silicon validation involves operating one or more
manufactured chips in actual application environments
to validate correct behaviors across specified operating
conditions. According to recent industry reports,
post-silicon validation is becoming significantly expensive.
Intel reported a headcount ratio of 3:1 for design vs.
post-silicon validation. According to Abramovici et al.,
post-silicon validation may consume 35% of average chip
development time. Yerramilli observes that post-silicon
validation costs are rising faster than the design costs.

Loosely speaking, there are two types of bugs that design
and validation engineers worry about:


1. Bugs caused by the interactions between the design
and the physical effects, also called electrical bugs.
Such bugs generally manifest themselves only under
certain operating conditions (temperature, voltage,
frequency). Examples include setup and hold time
problems.
2. Functional bugs, also called logic bugs, caused by design
errors.


Post-silicon validation involves four steps:


1. Detecting a problem by running a test program, such
as OS, games, or functional tests, until a system failure
occurs (e.g., system crash, segmentation fault, or
exceptions).

2. Localizing the problem to a small region from the system
failure, e.g., a bug in an adder inside an ALU of a
complex processor. The stimulus that exposes the bug,
e.g., the particular 10 lines of code from some application,
is also important.

3. Identifying the root cause of the problem. For example,
an electrical bug may be caused by power-supply noise
slowing down a circuit path resulting in an error at the
adder output.

4. Fixing or bypassing the problem by microcode patching,
circuit editing, or, as a last resort, respinning using a new
mask.


Josephson points out that the second step, bug localization,
dominates post-silicon validation effort and costs. Two
major factors that contribute to the high cost of traditional
post-silicon bug localization approaches are:


1. Failure reproduction which involves returning the chip 
to an error-free state, and re-executing the failure- 
causing stimulus (including test-program segment, 
interrupts, and operating conditions) to reproduce the 
same failure. Unfortunately, many electrical bugs are 
hard to reproduce. The difficulty of bug reproduction 
is exacerbated by the presence of asynchronous I/Os 
and multiple clock domains. 

2. System-level simulation for obtaining golden res- 
ponses, i.e., correct signal values for every clock cycle 
for the entire system (i.e., the chip and all the periph- 
eral devices on the board) to compare against the 
signal values produced by the chip being validated. 
Running system-level simulation is typically 7-8 orders 
of magnitude slower than actual silicon. 


Due to these factors, a functional bug typically takes hours
to days to be localized vs. an electrical bug that requires days
to weeks and more expensive equipment.


A previous version of this paper appeared in the
Proceedings of the 45th ACM-IEEE Design Automation
Conference (2008, Anaheim, CA).


Figure 1. Post-silicon bug localization flow using IFRA.


IFRA, an acronym for Instruction Footprint Recording
and Analysis, targets bug localization in processors. Figure
1 shows IFRA-based post-silicon bug localization flow.
During chip design, a processor is augmented with low-cost
hardware recorders (Section 2) for recording instruction
footprints, which are compact pieces of information
describing the flows of instructions (i.e., where each instruction
was at various points of time), and what the instructions
did as they passed through various design blocks of
the processor. During post-silicon bug detection, instruction
footprints are recorded in each recorder, concurrently
with system operation, in a circular fashion to capture the
last few thousand cycles of history before a failure.

Upon detection of a system failure, the recorded footprints
are scanned out through a Boundary-scan interface,
which is a standard interface present in most chips for testing
purposes. Since a single run up to a failure is sufficient
for IFRA to capture the necessary information (details in
Section 2), failure reproduction is not required for localization
purposes.

The scanned-out footprints, together with the test-program
binary executed during post-silicon bug detection,
are post-processed off-line using special analysis techniques
(Section 3) to identify the microarchitectural block with
the bug, and the instruction sequence that exposes the bug
(i.e., the bug exposing stimulus). Microarchitectural block
boundaries are defined specifically for IFRA. Examples
include instruction queue control, scheduler, forwarding
path, decoders, etc. IFRA post-analysis techniques do
not require any system-level simulation because they rely
on checking for self-consistencies in the footprints with
respect to the test-program binary.

Once a bug is localized using IFRA, existing circuit-level
debug techniques can then quickly identify the root cause
of bugs, resulting in significant gains in productivity, cost,
and time-to-market.

In this paper, we demonstrate the effectiveness of IFRA
for a DEC Alpha 21264-like superscalar processor model
because its architectural simulator and RTL model
are publicly available. Such superscalar processors contain
aggressive performance-enhancement features (e.g.,
execution of multiple instructions per cycle, execution of
instructions out of program order, and prediction of branch
targets and outcomes) that are present in many commercial
high-performance processors. Such features significantly
complicate post-silicon validation. For simpler in-order
processors (e.g., ARMv6, Intel Atom, Sun Niagara cores),
IFRA can be significantly simplified.

There is little consensus about models of functional
bugs. Hence, we focus on electrical bugs that can be modeled
as bit-flips (more details in Section 4). Extensive IFRA
simulations demonstrate:


1. For 75% of injected electrical bugs, IFRA pinpointed 
their exact location (1 out of 200 microarchitectural 
blocks) and the time they were injected (1 out of over 
1,000 cycles)—referred to as location-time pair. For 
21% of injected bugs, IFRA correctly identified their 
location-time pairs together with 5 other candidates 
(out of over 200,000 possible pairs) on average. IFRA 
completely missed correct location-time pairs for 
only 4% of injected bugs. 

2. The aforementioned results were obtained without relying
on system-level simulation and failure reproduction.

3. IFRA hardware introduces a very small area impact of
1% (dominated by on-chip memory for storing 60KB
of instruction footprints). If on-chip trace buffers
already exist for validation purposes, they can be
reused to reduce the area impact. Alternatively, a part
of data cache may also be used to reduce the area
impact of IFRA.


Related work on post-silicon validation can be
broadly classified as formal methods, on-chip
trace buffers for hardware debugging, off-chip
program and data tracing, clock manipulation,
scan-aided techniques, check-pointing with deterministic
replay, and online assertion checking.
Table 1 presents a qualitative comparison of IFRA vs. existing
post-silicon bug localization techniques. In Table 1, a
technique is categorized as being intrusive if it can alter
the functional/electrical behavior of the system, which
may prevent electrical bugs from being exposed.

Section 2 describes hardware support for IFRA. Section 3 
describes off-line analysis techniques performed on the 
scanned-out instruction footprints. Section 4 presents sim- 
ulation results, followed by conclusions in Section 5. 


2. IFRA HARDWARE SUPPORT 

The three hardware components of IFRA’s recording infra- 
structure, for a superscalar processor, are indicated as 
shaded parts in Figure 2. 


1. A set of distributed recorders, denoted by ‘R’ in 
Figure 2, with dedicated circular buffers. As an instruc- 
tion passes through a pipeline stage, the recorder 
associated with that stage records information spe- 
cific to that stage (Table 2). When no instruction 
passes through a pipeline stage for many cycles, con- 
secutive idle cycles are compacted into a single entry 
in the corresponding recorder.

Table 1. IFRA vs. existing techniques.


Techniques Formal Trace Scan Clock ; Program and Checkpoint Assertion IFRA 
Methods Buffer Methods Manipulation Data Tracing and Replay | Checking 

Intrusive? (+) No Depends (-) Yes Depends (-) Yes Depends | (+) No 

Failure reproduction? (-) Yes Depends (+) No 

System-level simulation? (+) No (-) Yes (+) No (+) No 

Area impact? (-) Yes | (+) No | (-) Yes (+) No (-) Yes (-)1% 

Applicability? (+) General L (-) Processor Depends (-) Processor 


Figure 2. Superscalar processor augmented with recording
infrastructure.

Table 2. Auxiliary information for each pipeline stage. The 2-bit and
3-bit residues are obtained by performing mod-3 and mod-7 operations
on the original values, respectively.

Pipeline    Auxiliary information            Bits per   Number of   Entries per
stage       description                      entry      recorders   recorder
Fetch       Program counter                  32         4           1024
Decode      Decoded bits                     4          4           1024
Dispatch    2-bit residue of register name   6          4           1024
Issue       3-bit residue of operands        6          4           1024
ALU, MUL    3-bit residue of result          3          4           1024
Branch      None                             0          2           1024
LSU         3-bit residue of result;         35         2           1024
            memory address
Commit      Fatal exceptions                 4          1           1

Total storage required for all recorders: 60KB. Each entry contains an
additional 8-bit instruction ID (explained later).


2. An ID (identification) assignment unit responsible for 
assigning and appending an ID to each instruction 
that enters the processor. 

3. A post-trigger generator, which is a mechanism for 
deciding when to stop recording. 
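
The behavior of a single recorder can be sketched in software as follows. This is only an illustrative Python model of a circular buffer with idle-cycle compaction, not the hardware design; the class name `Recorder`, the tuple layout of an entry, and the small capacity used in the example are assumptions.

```python
from collections import deque


class Recorder:
    """Software model of one footprint recorder with a circular buffer."""

    def __init__(self, capacity=1024):
        # A bounded deque drops the oldest entry once the buffer is full,
        # so only the most recent history survives.
        self.entries = deque(maxlen=capacity)

    def record(self, instr_id, aux):
        """Store a footprint: the instruction ID plus stage-specific data."""
        self.entries.append(("footprint", instr_id, aux))

    def idle(self):
        """Account for a cycle in which no instruction passed through.

        Consecutive idle cycles are compacted into a single entry that
        holds a running count.
        """
        if self.entries and self.entries[-1][0] == "idle":
            _, count = self.entries.pop()
            self.entries.append(("idle", count + 1))
        else:
            self.entries.append(("idle", 1))

    def scan_out(self):
        """Return the buffer contents, oldest entry first."""
        return list(self.entries)


if __name__ == "__main__":
    r = Recorder(capacity=3)
    r.record(0, "pc=0x100")
    r.idle()
    r.idle()
    r.record(1, "pc=0x104")
    r.record(2, "pc=0x108")
    print(r.scan_out())
```

With a capacity of three, the oldest footprint is overwritten in the example, while the two idle cycles occupy only one entry.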


While an instruction, with an ID appended, flows through 
a pipeline stage, it generates an instruction footprint cor- 
responding to that pipeline stage which is stored in the 
recorder associated with that pipeline stage. An instruction 
footprint corresponding to a pipeline stage consists of 


1. The instruction’s ID that was appended 

2. Auxiliary information (Table 2) that tells us what the 
instruction did in the microarchitectural blocks con- 
tained in that pipeline stage 


Synthesis results (using Synopsys Design Compiler with
TSMC 0.13 microns library) show that the area impact
of the IFRA hardware infrastructure is 1% on the Illinois
Verilog Model assuming a 2MB on-chip cache, which is
typical of current desktop/server processors. The area cost is
dominated by the circular buffers present in the recorders. 
Interconnect area cost is relatively low because the wires con- 
necting the recorders (Figure 2) operate at slow speed, and 
a large portion of this routing reuses existing on-chip scan 
chains that are present for manufacturing testing purposes. 


2.1. ID-assignment unit 

For the recorded data to be useful for offline analysis, it is 
necessary to identify which of the trillions of instructions 
that passed through the processor, produced each of the 
recorded footprints. Hence, each footprint in a recorder 
must have an identifier or ID. 

Simplistic ID assignment schemes have limited applica- 
bility. For example, assigning consecutive numbers to each 
incoming instruction, in a circular fashion, using very wide 
IDs is wasteful: using 40-bit IDs will increase the instruc- 
tion footprint total storage to 160KB from 60KB. When IDs 
are too short, e.g., 8-bit IDs if there can be only 256 instruc- 
tions in a processor at any one time, aliasing can occur for 
processors supporting out-of-order execution and pipeline 


flushes (process of discarding instructions in the middle of
execution to enforce a change in control flow). There can be
multiple instructions with the same ID in a processor at any
given time that may execute out of program order, making it
very difficult, if not impossible, to distinguish them.

The PC (program counter) value cannot be used as an 
instruction ID for processors supporting out-of-order 
execution, because programs with loops may produce multi- 
ple instances of the same instruction with the same PC value. 
These multiple instances may execute out of program order. 

It is difficult to use time-stamps or other global synchro- 
nization mechanisms as instruction IDs for processors 
supporting multiple clock domains and/or DVFS (dynamic 
voltage and frequency scaling) for power management. 

Our special ID assignment scheme, described below,
uses log_2 4n bits, where n is the maximum number of instructions
in a processor at any one time (e.g., n = 64 for Alpha
21264). The first two rules assign consecutive numbers to
incoming instructions and the third rule allows the scheme
to work under all the aforementioned circumstances:
i.e., for processors supporting out-of-order execution, pipeline
flushes, multiple clock domains and DVFS.

Instruction IDs are assigned to individual instructions
as they exit the fetch stage and enter the decode stage. Since
multiple instructions may exit the fetch stage in parallel at
any given clock cycle, multiple IDs are assigned in parallel.

Instruction ID Assignment Scheme used by IFRA:

Rule 1: The first p instructions that exit the fetch stage
in parallel are assigned IDs 0, 1, 2, ..., p-1.

Rule 2: Let ID X be the last ID that was assigned.
If there are q instructions that exit the fetch stage in
the current cycle in parallel, then q IDs, X + 1 (mod 4n),
X + 2 (mod 4n), ..., X + q (mod 4n) are assigned to the q
instructions.

Rule 3: If an instruction with ID Y causes a pipeline
flush, then the ID X in Rule 2 is overwritten with the value
of Y + 2n (mod 4n). As a result, ID of Y + 2n + 1 (mod 4n)
is assigned to the first instruction that is fetched after the
flush. The flush is caused either by a mispredicted branch
or an exception.
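
The three rules can be sketched in a few lines of Python. This is a behavioral model written for illustration, not the hardware implementation; the class name `IdAssigner` and its method names are assumptions.

```python
class IdAssigner:
    """Behavioral model of the three ID-assignment rules.

    `n` is the maximum number of instructions in the processor at any
    one time; IDs are taken modulo 4n and therefore fit in log2(4n) bits.
    """

    def __init__(self, n):
        self.n = n
        self.modulus = 4 * n
        self.last_id = None  # the ID X of Rule 2; None before Rule 1 applies

    def assign(self, count):
        """Assign IDs to `count` instructions leaving fetch in one cycle."""
        ids = []
        for _ in range(count):
            if self.last_id is None:
                self.last_id = 0  # Rule 1: numbering starts at 0
            else:
                self.last_id = (self.last_id + 1) % self.modulus  # Rule 2
            ids.append(self.last_id)
        return ids

    def flush(self, y):
        """Rule 3: the instruction with ID `y` caused a pipeline flush."""
        self.last_id = (y + 2 * self.n) % self.modulus


if __name__ == "__main__":
    unit = IdAssigner(n=2)   # IDs are taken modulo 8
    print(unit.assign(3))    # Rule 1
    print(unit.assign(2))    # Rule 2
    unit.flush(3)            # Rule 3: X becomes 3 + 4 = 7
    print(unit.assign(2))    # numbering resumes at 7 + 1 (mod 8)
```

For n = 64 the modulus is 256, which matches the 8-bit IDs used for the Alpha 21264-like processor.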




2.2. Post-trigger generators 

Suppose that a test program has been executing for billions
of cycles and an electrical bug is exercised after 5 billion
cycles from start. Moreover, suppose that the electrical
bug causes a system crash after another 1 billion cycles (i.e.,
6 billion cycles from the start). With limited storage, we are
only interested in capturing the information around the
time when the electrical bug is exercised. Hence, 5 billion
cycles’ worth of information before the bug occurrence
may not be necessary. On the other hand, if we stop recording
only after the system crashes, all the useful recorded
information will be overwritten. Thus, we must incorporate
mechanisms, referred to as post-triggers, for reducing error
detection latency, the length of time between the appearance
of an error caused by a bug and visible system failure.


Post-triggers targeting five different failure scenarios
are listed in Table 3. A hard post-trigger fires when there is
an evident sign of failure, and causes the processor operation
to terminate. Classical hardware error detection
techniques such as parity bits for arrays and residue codes
for arithmetic units as well as in-built exceptions, such
as unimplemented instruction exceptions and arithmetic
exceptions, belong to this category.
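
As an illustration of a residue check, the Python sketch below verifies an addition against mod-3 or mod-7 residues, the same moduli that Table 2 uses for its 2-bit and 3-bit residues. It assumes unbounded integers without overflow, and the helper names are hypothetical; a real arithmetic unit would also have to account for carry-out and wraparound.

```python
def residue(value, bits):
    """Residue of `value` modulo 2**bits - 1 (2 bits: mod 3, 3 bits: mod 7)."""
    return value % ((1 << bits) - 1)


def add_is_consistent(a, b, result, bits=3):
    """Check a claimed sum against the residues of its operands.

    Uses (a + b) mod m == ((a mod m) + (b mod m)) mod m, so the check
    needs only the short residues, not the full-width values.
    """
    modulus = (1 << bits) - 1
    expected = (residue(a, bits) + residue(b, bits)) % modulus
    return expected == residue(result, bits)


if __name__ == "__main__":
    a, b = 1234, 5678
    good = a + b
    flipped = good ^ (1 << 5)  # model a single bit-flip in the adder output
    print(add_is_consistent(a, b, good))
    print(add_is_consistent(a, b, flipped))
```

A single bit-flip changes the result by a power of two, and no power of two is divisible by 3 or by 7, so such an error always changes the residue and is caught by the check.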

However, hard post-trigger mechanisms alone are not
sufficient, e.g., for the two tricky scenarios described in the last
two rows of Table 3. These two failure scenarios may be
detected several millions of cycles after an error occurs,
causing useful recorded information to be overwritten
even with the existing error detection mechanisms.
Hence, we introduce the notion of soft post-triggers.

A soft post-trigger fires when there is an early symptom 
of a possible failure. It causes the recording in all record- 
ers to pause, but allows the processor to keep running. If 
a hard post-trigger for the failure corresponding to the 
symptom occurs within a pre-specified amount of time, 
the processor stops. If a hard post-trigger does not fire 
within the specified time, the recording resumes assum- 
ing that the symptom was false. 

Segmentation fault (or segfault) requires OS handling 
and, hence, may take several millions of cycles to resolve. 
Null-pointer dereference is detected by adding simple 
hardware in the Load/Store unit. For other illegal memory 
accesses, TLB-miss is used as the soft post-trigger. If a seg- 
fault is not declared by the OS while servicing the TLB-miss, 
the recording is resumed on TLB-refill. On the other hand, if 
a segfault is returned, then a hard post-trigger is activated. 
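
The interplay of soft and hard post-triggers can be summarized as a small state machine. The Python sketch below is a simplified model under stated assumptions: time is counted in cycles, the `window` parameter stands for the pre-specified amount of time, and any hard post-trigger stops the processor without checking that it corresponds to the earlier symptom. The names `State` and `PostTriggerController` are illustrative.

```python
from enum import Enum


class State(Enum):
    RECORDING = "recording"
    PAUSED = "paused"    # soft post-trigger fired, processor keeps running
    STOPPED = "stopped"  # hard post-trigger fired, processor stops


class PostTriggerController:
    """Simplified model of soft and hard post-trigger handling."""

    def __init__(self, window):
        self.window = window  # cycles to wait for a hard post-trigger
        self.state = State.RECORDING
        self.paused_at = None

    def soft_trigger(self, cycle):
        """Early symptom of a possible failure: pause all recorders."""
        if self.state is State.RECORDING:
            self.state = State.PAUSED
            self.paused_at = cycle

    def hard_trigger(self, cycle):
        """Evident sign of failure: stop, preserving recorder contents."""
        self.state = State.STOPPED

    def tick(self, cycle):
        """Resume recording if the symptom turned out to be false."""
        if self.state is State.PAUSED and cycle - self.paused_at >= self.window:
            self.state = State.RECORDING
            self.paused_at = None


if __name__ == "__main__":
    ctrl = PostTriggerController(window=100)
    ctrl.soft_trigger(cycle=10)
    ctrl.tick(cycle=110)        # no failure followed: recording resumes
    ctrl.soft_trigger(cycle=200)
    ctrl.hard_trigger(cycle=250)
    print(ctrl.state)
```

Pausing on the symptom is what protects the footprints: while the controller is paused, nothing is overwritten, so the history leading up to the symptom is still in the buffers if a hard post-trigger follows.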


3. POST-ANALYSIS TECHNIQUES 
Once recorder contents are scanned out, footprints belonging
to the same instruction (but in multiple recorders) are identified
and linked together using a technique called footprint
linking (Section 3.1). The linked footprints are also mapped
to the corresponding instruction in the test-program binary
using the program counter value stored in the fetch-stage
recorder (Table 2).

As shown in Figure 3, after the footprint linking, four
high-level post-analysis techniques (Section 3.2) that
are independent of microarchitecture are run. After that,
low-level analysis (Section 3.3), represented as a decision
diagram, asks a series of microarchitecture-specific questions
until the final bug location-time pair(s) is obtained.
The bug exposing stimuli are derived from the location-time
pairs. Currently, the decision diagram is created manually
based on the microarchitecture. Automatic generation
of such decision diagrams is a topic of future research.


Table 3. Failure scenarios and post-triggers.

Failure scenario    Soft post-trigger            Hard post-trigger
Array error         -                            Parity check
Arithmetic error    -                            Residue check
Fatal exceptions    -                            In-built exceptions
Deadlock            Short (2 mem loads)          Long (2 secs)
                    instruction retirement gap   instruction retirement gap
Segfault            TLB-miss + TLB-refill        Segfault from OS;
                                                 address equals 0


Figure 3. Post-analysis summary: Park et al. describe the exact
questions asked at each decision node.

The post-analysis techniques rely on the concept of
self-consistency, which checks for the existence of contradictory
events in collected footprints with respect to the test-program
binary. While such checks are extensively used in
fault-tolerant computing for error detection, the key
difference here is that we use them for bug localization.
Such application is possible because, unlike fault-tolerant
computing, the checks are performed off-line, enabling
more complex analysis for localization purposes.


3.1. Footprint linking 
Figure 4 shows a part of a test program and the contents of three
(out of many) recorders right after they are scanned out. As
explained in Section 2, since we use short instruction IDs (8 bits
for an Alpha 21264-like processor), we end up with multiple
footprints having the same ID in the same recorder and/or in
multiple recorders. For example, in Figure 4, ID 0 appears in three
entries of the fetch-stage recorder, in two entries of the
issue-stage recorder, and in three entries of the execution-stage
recorder.

Which of these ID 0s correspond to the same instruction? This
question is answered by the following special properties enforced by
the ID assignment scheme presented in Section 2.1:


Property 1. All flushed instructions are identified by utilizing
Rule 3 in our special ID assignment scheme (Section 2.1).

Property 2. If instruction A was fetched before instruction B, and
they both have the same ID, then A will always exit any pipeline
stage (and leave its footprint in the corresponding recorder) before
B does for that same pipeline stage.
Figure 4. Instruction footprint linking, with a maximum number of 2
instructions in flight (i.e., m = 2).

In Figure 4, using the first property, footprints corresponding to
flushed instructions are identified and discarded. After discarding,
using the second property, the youngest ID 0s across all recorders
are linked together, followed by linking of the second youngest
ID 0s, and so on. Since the PC is stored in the fetch-stage
recorder, we can link the instruction ID back to the test-program
binary to find the corresponding instruction.
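
The linking procedure can be sketched in a few lines of Python. This is a simplified illustration that takes the description literally, not the authors' implementation: the tuple layout (instr_id, flushed, payload), the precomputed flushed flag (standing in for Rule 3), and the stage names are all assumptions of this example.

```python
from collections import defaultdict

def link_footprints(recorders):
    """Link footprints that belong to the same instruction.

    recorders maps a stage name to a list of (instr_id, flushed, payload)
    tuples in recording order, oldest first. The tuple layout and the
    flushed flag are assumptions of this sketch.
    """
    surviving = {}
    for stage, entries in recorders.items():
        by_id = defaultdict(list)
        for instr_id, flushed, payload in entries:
            if not flushed:                 # Property 1: discard flushed footprints
                by_id[instr_id].append(payload)
        surviving[stage] = dict(by_id)

    all_ids = sorted({i for by_id in surviving.values() for i in by_id})
    linked = []
    for instr_id in all_ids:
        depth = max(len(by_id.get(instr_id, [])) for by_id in surviving.values())
        for age in range(depth):            # age 0 = youngest, 1 = second youngest
            record = {}
            for stage, by_id in surviving.items():
                payloads = by_id.get(instr_id, [])
                if age < len(payloads):     # Property 2: same order in every recorder
                    record[stage] = payloads[-1 - age]
            linked.append((instr_id, age, record))
    return linked
```

Each returned tuple groups, for one instruction ID and one age, the footprint found in every recorder. The fetch-stage payload would carry the PC used to map the group back to the test-program binary.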


3.2. High-level analysis 

IFRA uses four high-level analysis techniques: (1) data dependency
analysis, (2) program control-flow analysis, (3) load-store
analysis, and (4) decoding analysis.

Each analysis technique is applied separately. We are interested in
the inconsistency that is closest to the electrical bug
manifestation in terms of time (i.e., the eldest inconsistency).
Thus, if more than one technique identifies inconsistencies, the
reported inconsistencies are compared to see which one occurred
earliest. The high-level analysis technique with the earliest
inconsistency then decides the entry point into the decision diagram
for low-level analysis. Here we briefly explain the control-flow
analysis, one of the high-level analysis techniques, to illustrate
the idea.
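
Choosing the entry point amounts to taking a minimum over the analyses that reported something. The short Python sketch below assumes, purely for illustration, that each analysis reports the position of its earliest inconsistency in the serial trace (smaller means earlier) or None if it found nothing. The analysis names used as keys are hypothetical labels.

```python
def pick_entry_point(findings):
    """Return the name of the analysis whose inconsistency occurred earliest.

    findings maps an analysis name to the trace position of its earliest
    inconsistency (smaller = earlier), or None if it found none.
    """
    found = {name: pos for name, pos in findings.items() if pos is not None}
    if not found:
        return None                      # no inconsistency reported at all
    return min(found, key=found.get)     # earliest (eldest) inconsistency wins
```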

In the program control-flow analysis, four types of illegal
transitions are searched for in the PC sequence of the serial
execution trace (obtained from the fetch-stage recorder and the
test-program binary during footprint linking), starting from the
eldest PC.


1. The PC does not increment by +4 except in the presence of a
control-flow transition instruction (e.g., branch, jump).

2. A PC jump does not occur in the presence of an unconditional
transition instruction.

3. The PC does not jump to the correct target in the presence of a
direct transition (with a target address that does not depend on a
register value).

4. The PC does not jump to an address that is part of the executable
address space (determined from the program binary) in the presence
of a register-indirect transition (with a target address that
depends on a register value).


If any illegal transition is found, the low-level analysis 
scrutinizes the PC register with the instruction that made an 
illegal transition. 
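
The four checks can be sketched as a single pass over the PC sequence. The Python below is a minimal illustration under stated assumptions: the Instr record, its kind labels, and the use of the program map's keys as the executable address space are inventions of this example, and indirect jumps are checked only against rule 4.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Instr:
    kind: str                     # "plain", "cond_direct", "uncond_direct", "indirect"
    target: Optional[int] = None  # static target, for direct transitions only

def first_illegal_transition(pcs, program):
    """Return (index, rule) for the eldest illegal transition, or None.

    pcs is the serial PC trace, eldest first; program maps each PC in the
    binary to an Instr. Rules are numbered 1 to 4 as in the list above.
    """
    for i in range(len(pcs) - 1):
        pc, nxt = pcs[i], pcs[i + 1]
        instr = program.get(pc)
        if instr is None:
            return None                           # outside the known binary: stop
        fell_through = (nxt == pc + 4)
        if instr.kind == "plain":
            if not fell_through:
                return (i, 1)                     # jump without a transition instr
        elif instr.kind == "uncond_direct":
            if fell_through and instr.target != pc + 4:
                return (i, 2)                     # required jump did not happen
            if nxt != instr.target:
                return (i, 3)                     # jumped to the wrong target
        elif instr.kind == "cond_direct":
            if not fell_through and nxt != instr.target:
                return (i, 3)                     # taken branch, wrong target
        elif instr.kind == "indirect":
            if nxt not in program:
                return (i, 4)                     # left the executable space
    return None
```

The returned index identifies the instruction that made the illegal transition, which is where the low-level analysis would start.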


3.3. Low-level analysis 

The low-level analysis involves asking a series of
microarchitecture-specific questions according to the decision
diagram. We present a simple example by tracing one of the paths in
the decision diagram.

Consider an example where a segfault (Section 2.2) during
instruction access was detected, and the fourth illegal transition
of the control-flow analysis was identified. We also assume that R5
shown in Figure 5 was the register used for the register-indirect
transition. Instructions B and C have a producer-consumer
relationship: B writes its result into register R0, and C uses a
value from register R0.

The first question in the decision diagram is whether C consumed the
value B produced. The execute-stage recorder contains the residues
of results and the issue-stage recorder contains the residues of
operands of instructions. Comparing the two values during
post-analysis shows that they do not match; i.e., B produced a value
with a residue of 5, while C received a value with a residue of 3.
This is clearly a problem.

The second question in the decision diagram is whether C and B used
the same physical register to pass along the value. Analysis of the
contents of the dispatch-stage recorder, which records the physical
register name, reveals that B wrote its result into physical
register P2, while C read its operand value from physical register
P5, and they are not the same, as shown in Figure 6.

There is again a problem, and the third question in the decision
diagram asks whether C used a value produced by the previous
producer of register R0 (the instruction that wrote its result into
register R0 prior to the immediate producer). Instruction A in
Figure 7 is the previous producer of register R0, and analysis of
the contents of the dispatch-stage recorder reveals that this is
indeed the case.
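
The three questions traced so far can be written as a short chain of comparisons. The Python sketch below is only an illustration of this one path: the dictionary keys and the returned labels are hypothetical, and A's destination register P5 is inferred from the statement that C used the value produced by A.

```python
def walk_decision_path(producer, consumer, previous_producer):
    """Follow the three questions from the worked example.

    Each argument is a dict of values read from linked footprints; the
    key names and the returned labels are hypothetical.
    """
    # Question 1: did the consumer receive the value the producer wrote?
    if consumer["operand_residue"] == producer["result_residue"]:
        return "value delivered correctly"
    # Question 2: did both use the same physical register?
    if consumer["source_preg"] == producer["dest_preg"]:
        return "value corrupted between write and read"
    # Question 3: did the consumer read the previous producer's register?
    if consumer["source_preg"] == previous_producer["dest_preg"]:
        return "consumer used stale register mapping"
    return "ask further questions"

# Values from the example: B wrote residue 5 to P2, C read residue 3 from P5.
A = {"dest_preg": "P5"}
B = {"result_residue": 5, "dest_preg": "P2"}
C = {"operand_residue": 3, "source_preg": "P5"}
outcome = walk_decision_path(B, C, A)
```

With these values the walk falls through the first two questions and stops at the third, matching the path described in the text.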

Asking several more questions leads to the bug location and the
exposing stimulus shown in Figure 8. The instruction trace between
instruction A and instruction B is responsible for stimulating the
bug, and the trace afterwards is responsible for propagating the bug
to an observation point such as a soft post-trigger.


Figure 5. First question in the low-level analysis example: Did C
consume the value B produced? Answer: No


4. RESULTS 

We evaluated IFRA by injecting errors into a microarchitectural
simulator augmented with IFRA. For an Alpha 21264 configuration
(4-way pipeline, 64 maximum instructions in-flight, 2 ALUs, 2
multipliers, 2 load/store units), there are 200 different
microarchitectural blocks (excluding array structures and arithmetic
units, since errors inside those structures are immediately detected
and localized using parity and/or residue codes, as discussed in
Section 2.2). Each block has an average size equivalent to 10K
2-input NAND gates. Seven benchmarks from SPECint2000 (bzip2, gcc,
gap, gzip, mcf, parser, vortex) were chosen as validation test
programs because they represent a variety of workloads. Each
recorder was sized to have 1024 entries.


Figure 6. Second question asked in the low-level analysis example:
Did C and B use the same physical register to pass along the value?
Answer: No

Figure 7. Third question asked in the low-level analysis example:
Did C and A use the same physical register to pass along the value?
Answer: Yes

Figure 8. Bug location (enclosed in grey area, which includes part
of the decoder responsible for decoding the architectural
destination register, the write circuitry into a register mapping
table, and all the pipeline registers in between) shown on the left
and the exposing stimulus shown on the right.

All bugs were modeled as single bit-flips at flip-flops to target
hard-to-repeat electrical bugs. This is an effective model because
electrical bugs eventually manifest themselves as incorrect values
arriving at flip-flops for certain input combinations and operating
conditions.

Errors were injected in one of 1191 flip-flops [Park and Mitra]. No
errors were injected inside array structures since they have
built-in parities for error detection.
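
The fault model itself is easy to illustrate. The Python sketch below is not the simulator's injection code: it treats a register as an integer, inverts one bit of it, and picks one of the 1191 flip-flops with a seeded random generator. The function names and the seeding scheme are assumptions of this example.

```python
import random

NUM_FLIP_FLOPS = 1191  # number of candidate flip-flops reported in the text

def flip_bit(value, bit):
    """Return value with the given bit inverted (a single bit-flip)."""
    return value ^ (1 << bit)

def choose_injection_site(seed):
    """Pick one flip-flop index at random; seeded so a run can be repeated."""
    return random.Random(seed).randrange(NUM_FLIP_FLOPS)
```

Flipping the same bit twice restores the original value, which makes it easy to check an injection harness against a fault-free run.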

Upon error injection, the following scenarios are 
possible: 


1. The error vanishes without any effect at the system 
level or produces an incorrect program output with- 
out any post-trigger firing. This case is related to the 
coverage of validation test programs and post-triggers, 
and is not the focus of this paper. 

2. Failure manifestation with short error latency, where 
recorders successfully capture the history from error 
injection to failure manifestation (including situations 
where recording is stopped/paused upon activation of 
soft post-triggers). 


3. Failure manifestation with long error latency, where
1024-entry recorders fail to capture the history from
error injection to failure (including soft triggers).


Out of 100,000 error injection runs, 800 resulted in Cases 2 and 3.
Figure 9 presents results from these two cases. The "exactly
located" category represents the cases in which IFRA returned a
single and correct location-time pair (as defined in Section 1). The
"candidate located" category represents the cases in which IFRA
returned multiple location-time pairs (called candidates) out of
over 200,000 possible pairs (1 out of 200 microarchitectural blocks
and 1 out of 1,000 cycles), and at least 1 pair was fully correct in
both location and time. The "completely missed" category represents
the cases where none of the returned pairs were correct, even if
either location or time is correct. In addition, we pessimistically
report all errors that resulted in Case 3 as "completely missed."
All error injections were performed after a million cycles from the
beginning of the program in order to demonstrate that there is no
need to keep track of footprints from the beginning.


Figure 9. IFRA bug localization summary.
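
The three reporting categories can be stated precisely as a small classification function. The Python sketch below is an illustration of the definitions in the text, not the authors' evaluation script: the function name, the (block, cycle) tuple representation of a location-time pair, and the history_captured flag for Case 3 are assumptions.

```python
def classify_run(returned_pairs, true_pair, history_captured=True):
    """Classify one error-injection run.

    returned_pairs is the list of (block, cycle) pairs reported by the
    analysis; true_pair is the injected (block, cycle). Runs whose
    history was not captured (Case 3) are pessimistically counted as
    misses, as in the text.
    """
    if not history_captured:
        return "completely missed"
    if true_pair not in returned_pairs:      # both location and time must match
        return "completely missed"
    if len(returned_pairs) == 1:
        return "exactly located"
    return "candidate located"
```

A pair that matches only the block or only the cycle does not count as correct, which is why the membership test compares the whole tuple.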

It is clear from Figure 9 that a large percentage of bugs
were uniquely located to the correct location-time pair, while
very few bugs were completely missed, demonstrating the
effectiveness of IFRA.


5. CONCLUSION
IFRA targets the problem of post-silicon bug localization in a
system setup, which is a major challenge in processor post-silicon
design validation. There are two major novelties of IFRA:


1. High-level abstraction for bug localization using 
low-cost hardware recorders that record semantic 
information about instruction data and control flows 
concurrently in a system setup. 

2. Special techniques, based on self-consistency, to ana- 
lyze the recorded data for localization after failure 
detection. 


IFRA overcomes major post-silicon bug localization 
challenges. 


1. It helps bridge a major gap between system-level and 
circuit-level debug. 

2. Failure reproduction is not required. 

3. Self-consistency checks associated with the analysis 
techniques minimize the need for full system-level 
simulation. 


IFRA creates several interesting research directions: 


1. Automated construction of the post-analysis decision 
diagram for a given microarchitecture. 

2. Sensitivity analysis and characterization of the
interrelationships between post-analysis techniques,
architectural features, error detection mechanisms, recorder
sizes, and bug types.

3. Application to homogeneous/heterogeneous multi- 
and many-core systems, and system-on-chips (SoCs) 
consisting of nonprocessor designs. 
Based on the qualifications and experience, successful candidates can look forward to an excellent remuneration package, and start-up grants to pursue
research interests in the broad field of Computer Engineering/Computer Science.


Further information about the school can be obtained at http://www.ntu.edu.sg/sce. Informal enquiries and submission of application forms can be
made to SCEHR@ntu.edu.sg. Guidelines for application submission and application forms can be obtained from
http://www.ntu.edu.sg/ohr/Career/SubmitApplications/Pages/default.aspx.


Closing Date: 145 March 2010 


www.ntu.edu.sg 
research. Responsibilities include teaching undergraduate
computing courses in the Department's programs in Computer
Information Technology and Computer Science; advising and mentoring
majors; developing curricula in both programs; conducting research
and engaged scholarship in computing fields; serving on college and
university committees; and serving the computing community of
educators, researchers, and professionals. Review of applications
will begin immediately and continue until the position is filled.
Salary is commensurate with qualifications and experience.

Interested candidates should submit a let- 
ter of application, curriculum vitae, a statement 
about teaching and research philosophy, and 
contact information for three persons from 
whom letters of recommendation can be request- 
ed. Please send information to: 

Computer Science Search Committee 

Department of Computer Science 

TCU Box 298850 

Texas Christian University 

Ft. Worth, TX 76129 


TCU is an EEO/AA employer. Women and mi- 
norities are encouraged to apply. 


Toyota Motor Engineering and
Manufacturing North America, Inc.
Senior Research/Principal Scientist &
Research Engineer/Scientist

Senior Researcher:

> Apply special knowledge & talents to develop & execute new,
independent research projects for robotic and mobile platform
applications.

> Provide high-quality deliverables such as research/quick
prototyping software, written & oral reports, IP, publications for
peer-reviewed journals and conferences.

> Experience in artificial intelligence, intelligent signal
processing & sensor-fusion research.

> Experience in robotic research and testing.

> Ph.D. in a related field of study, preferably with 3 - 8 years of
research experience.

> Strong programming skills (C/C++, Unix).

Researcher position:

> Apply special knowledge & talents to address outstanding research
& development challenges in mid & long-term ITS (Intelligent
Transportation Systems), robotics, and computer vision projects.

> Write dedicated software, design & carry out experiments on mobile
platforms.

> Experience in computer vision and perception with one or more
sensor modalities.

> Good programming skills are essential.

> M.S. (Electrical/Computer Engineering or Computer Science) or
Ph.D. in a related field of study.

Apply URL: http://www.toyota.com/toyota/about/jobs/JobsSearch.do


University of Campinas (UNICAMP)
Tenured Professor

The Faculty of Technology of UNICAMP invites applications for one
tenured faculty position starting in early 2011. All areas of
Computer Science and Engineering will be considered, but preference
will be given to candidates with interest in programming languages,
discrete mathematics, computational modelling and graph theory.
Successful candidates must have a strong commitment to academic
excellence and teaching, and interest to establish and lead an
independent and creative research group. Start-up resources and
research infrastructure will be available.

Applicants should hold a PhD degree in Computer Science/Engineering
or a closely related field, have excellent research and teaching
record and leadership skills. Screening will begin on January 1st,
2010 and will continue until the position is filled. To apply please
send (PDF only) curriculum vitae, including publication list, brief
statements of research and teaching to
info.ftecnol@reitoria.unicamp.br.

UNICAMP is considered one of the best research universities of
Brazil. It has 33 thousand enrolled students, 48% of which are
graduate students from 135 research programs, ranging from Music to
Molecular Biology. UNICAMP is responsible for 15% of the research
published in Brazil. The university is also responsible for the
innovations that led to the development of some major technologies
in Brazil like those in its fiber optics and telecommunication
networks and biomedical industries. UNICAMP is located in Campinas,
a well-known technology town of Brazil, in the State of São Paulo,
which accounts for 40% of Brazil's GNP.


ECOLE POLYTECHNIQUE
FEDERALE DE LAUSANNE

Dean
of Computer and Communication Sciences
at Ecole polytechnique fédérale de Lausanne (EPFL)

EPFL is conducting an international search for the Dean of the
School of Computer and Communication Sciences, to take office by the
fall of 2010.

EPFL, located in Lausanne (Switzerland), is a leading European
University and a dynamically growing and well-funded institution
fostering excellence and diversity. It has a highly international
campus at an exceptionally attractive location and a first-class
infrastructure. As technical university it covers computer &
communication sciences, engineering, environmental, basic and life
sciences, management of technology and financial engineering. It
offers a fertile environment for research cooperation between
different disciplines.

The School of Computer and Communication Sciences, with 42 faculty
members, has experienced a strong development over the recent years
to one of the top departments in its area in Europe. The School
enrolls about 700 students in its bachelor and master programs in
computer science and communication systems, has a highly competitive
doctoral program with 300 PhD candidates recruited world-wide and
hosts important industrial and research centers, such as the Swiss
National Center of Excellence in Research in Mobile Information and
Communication Systems and industry lablets by Nokia, SwissCom, and
Logitech.

The Dean bears the overall responsibility for the school in matters
of education, research, finance and organization and reports to the
President of EPFL. The position offers competitive compensation and
tenure at full professor level. Candidates should have an
outstanding academic record, a strong vision for the development of
the faculty in research, teaching, and technology transfer,
proficiency in recruiting, and exceptional leadership, communication
and management skills. EPFL will provide the means to realize a
strategic development of the school over the coming years with the
objective to establish world-class leadership in education and
research.

The School of Computer and Communication Sciences Dean Search
Committee invites letters of nomination, applications (vision
statement, complete CV, and the name of up to 5 professional
references), or expressions of interest. The screening of
applications will start on March 1st, 2010. Materials and inquiries
should be addressed, preferably electronically (PDF format) to:

Prof. Karl Aberer
Chairman of the Search Committee
e-mail: karl.aberer@epfl.ch

More information on EPFL and the School of Computer and
Communication Sciences can be found at http://www.epfl.ch and
http://ic.epfl.ch respectively.

EPFL is committed to balance genders within its faculty, and
strongly encourages women to apply.


CAREERS


University of Massachusetts Amherst
Department of Computer Science
Faculty Positions in Computer Science

The University of Massachusetts Amherst invites applications for two
tenure-track faculty positions at the assistant professor level.
Applicants must have a Ph.D. in Computer Science or related area and
should show evidence of exceptional research promise.

For the first position, we seek Computer Science candidates able to
collaborate with the departments of Linguistics and Psychology. The
applicant should have a strong background in Natural Language
Processing, preferably in the area of Syntax and Semantics. The
applicant should also show a strong record of publication in Natural
Language Processing and other areas that relate to Linguistics and
Psychology, such as Machine Learning, Data Mining, and Information
Retrieval. A history of interdepartmental collaboration is desirable
but not required. (R36783)

For the second position, we seek Computer Science candidates that
can collaborate with the departments of Sociology, Political
Science, and Mathematics & Statistics. We seek a candidate in the
area of Computational Social Science, or related areas of applied
and theoretical computer science that could be relevant to the
social sciences, such as social network analysis. (R36785)

The department is committed to the development of a diverse faculty
and student body, and is very supportive of junior faculty,
providing both formal and informal mentoring. We have a strong
record of NSF CAREER awards and other early research funding. We
lead the NSF-funded Commonwealth Alliance for Information Technology
Education (CAITE) to design and carry out comprehensive programs
that address underrepresentation in information technology (IT)
education and the workforce.

The Department of Computer Science has 40 tenure and research track
faculty and 180 Ph.D. students with broad interdisciplinary research
interests. The department offers first-class research facilities.
Please see http://www.cs.umass.edu for more information. The
University provides an intellectual environment committed to
providing academic excellence and diversity including mentoring
programs for faculty. The College and the Department are committed
to increasing the diversity of the faculty, student body and the
curriculum. To apply, please send a cover letter referencing search
R36783 or R36785 with your vita, a research statement, a teaching
statement and at least three letters of recommendation.

We also invite applications for Research Faculty (R36769) and
Research Scientist, Postdoctoral Research Associate, and Research
Fellow (R36768) positions in all areas of Computer Science.
Applicants should have a Ph.D. in Computer Science or related area
(or an M.S. plus equivalent experience), and should show evidence of
exceptional research promise. These positions are grant-funded;
appointments will be contingent upon continued funding. To apply,
please send a cover letter with your vita, a research statement and
at least three letters of recommendation.

Electronic submission of application materials is recommended.
Application materials may be submitted in pdf format to
facrec@cs.umass.edu. Hard copies of the application materials may be
sent to: Search {fill in number from above}, c/o Chair of Faculty
Recruiting, Department of Computer Science, University of
Massachusetts, Amherst, MA 01003.

We will begin to review applications on January 4, 2010 and will
continue until available positions are filled. Salary and rank
commensurate with education and experience; comprehensive benefits
package. Positions to be filled dependent upon funding. Inquiries
and requests for more information can be sent to:
facrec@cs.umass.edu

The University of Massachusetts is an Affirmative Action/Equal
Opportunity employer. Women and members of minority groups are
encouraged to apply.


Faculty Positions
Department of Software Design and Management
Kyungwon University

The Department of Software Design and Management at Kyungwon
University in South Korea invites applications for a tenure-track
position at the assistant professor or associate professor level.
Kyungwon University is located at Seongnam near Seoul. Further
information about the university can be obtained at
http://www.kyungwon.ac.kr/english.

The Department of Software Design and Management is a new department
within the IT College. It is to launch in March 2010. It aims to
become one of the world's top institutes for undergraduate software
education. Dr. Won Kim, a world-renowned pioneer in object-oriented
and object-relational database technology, has joined Kyungwon
University as IT Vice President and a lifetime professor to create,
launch and grow the Department.

The Department seeks qualified candidates in one or more of the
following areas:

* software engineering and architecture
* intelligent multimedia processing
* computer networking and communication

Applicant should have a strong passion for teaching, a Ph.D. degree
in computer science from a reputable U.S. university, experience in
teaching undergraduate computer science courses, and strong
publication records.

How to Apply: Send a resume and cover letter, along with three
letters of reference to affairs@kyungwon.ac.kr.


The Faculty of Engineering of the University of Freiburg, with its
Departments of Computer Science and Microsystems Engineering,
invites applications for the position of a

Full Professor (W3) of Computer Science

The successful candidate will be expected to establish a
comprehensive research and teaching program in the area of
algorithms and complexity with specialisation either in
cryptography, optimization or parallel computing.

The University of Freiburg aims to increase the representation of
women in research and teaching, and therefore expressly encourages
women with suitable qualifications to apply for the post.

Information about the Department of Computer Science can be obtained
from www.informatik.uni-freiburg.de.

Applications, including a curriculum vitae, publications list and
statement of research interests should be sent by March 8, 2010 to
the Dean of the Faculty of Engineering, University of Freiburg,
Georges-Koehler-Allee 101, 79110 Freiburg, Germany
(www.tf.uni-freiburg.de). Please send an electronic version of your
application to the Dean's office and ask for our application form
(dekanat@tf.uni-freiburg.de).


University of Toronto
Assistant Professor - Computer Science

The Department of Computer and Mathematical Sciences, University of
Toronto Scarborough (UTSC), and the Graduate Department of Computer
Science, University of Toronto, invite applications for a
tenure-stream appointment at the rank of Assistant Professor, to
begin July 1, 2010.

We are interested in candidates with research expertise in Computer
Systems, including Operating Systems, Networks, Distributed Systems,
Database Systems, Computer Architecture, Programming Languages, and
Software Engineering.

Candidates should have, or be about to receive, a Ph.D. in computer
science or a related field. They must demonstrate an ability to
pursue innovative research at the highest level, and a commitment to
undergraduate and graduate teaching. Evidence of excellence in
teaching and research is necessary. Salary will be commensurate with
qualifications and experience.

The University of Toronto is an international leader in computer
science research and education, and the Department of Computer and
Mathematical Sciences enjoys strong ties to other units within the
University. The successful candidate for this position will be
expected to participate actively in the Graduate Department of
Computer Science at the University of Toronto, as well as to
contribute to the enrichment of computer science academic programs
at the University's Scarborough campus.

Application materials, including curriculum vitae, research
statement, teaching statement, and three to five letters of
recommendation, should be submitted online at www.mathjobs.org,
preferably well before our deadline of January 17, 2010.

PLEASE NOTE THAT WE ARE ONLY ACCEPTING APPLICATIONS AT:
www.mathjobs.org

For more information about the Department of Computer & Mathematical
Sciences @ UTSC, please visit our home page
www.utsc.utoronto.ca/~csms.


University of Toronto 
Lecturer - Computer Science 


The Department of Computer and Mathemati- 
cal Sciences, University of Toronto Scarborough 
(UTSC), invites applications for a full-time posi- 
tion in Computer Science at the rank of Lecturer, 
to begin July 1, 2010. 

We are especially interested in candidates 
who will help advance our curriculum in the areas 
of computer systems, computer architecture, and 
software engineering. 

Appointments at the rank of Lecturer may 
be renewed annually to a maximum of five years. 
In the fifth year of service, Lecturers shall be re- 
viewed and a recommendation made with respect 
to promotion to the rank of Senior Lecturer. 

Responsibilities include lecturing, conducting tutorials,
grading, and curriculum development in a variety of
undergraduate courses.

Candidates should have a post-graduate degree, 
preferably a PhD, in Computer Science or a related 
field, and must demonstrate potential for excel- 
lence in teaching at the undergraduate level. 

Salary will be commensurate with qualifica- 
tions and experience. 

Application materials, including curriculum vitae, a statement
of career goals and teaching philosophy, evidence of teaching
excellence, and a minimum of three reference letters should be
submitted online at: www.mathjobs.org, preferably well before
our deadline of March 1, 2010.

PLEASE NOTE THAT WE ARE ONLY ACCEPT- 
ING APPLICATIONS AT: www.mathjobs.org 

For more information about the Department of 
Computer & Mathematical Sciences @ UTSC, please 
visit our home at: www.utsc.utoronto.ca/~csms 


University of Toronto 
Mendelzon Visiting Assistant Professor 


The Department of Computer and Mathematical Sciences, University
of Toronto Scarborough, invites applications for a
non-tenure-stream, two-year appointment as the Mendelzon Visiting
Assistant Professor, to begin July 1, 2010.

We will consider applicants in all areas of 
computer science, but are especially interested in 
applicants who will help advance our curriculum 
in computer systems and software engineering. 

The University of Toronto is an international leader in computer
science research and education, and the Department of Computer and
Mathematical Sciences enjoys strong ties to other units within the
University.

The successful candidate for this position will be encouraged to
engage in collaborative research with other computer science faculty
at the university, as well as to contribute to the enrichment of
computer science academic programs at the University's Scarborough
campus.

Candidates should have, or be about to receive, a Ph.D. in computer
science or a related field. They must demonstrate an ability to
pursue innovative research, and a commitment to undergraduate
teaching.

Application materials, including curriculum vitae, research
statement, teaching statement, and three to five letters of
recommendation, should be submitted online at www.mathjobs.org,
preferably well before our deadline of January 17, 2010.


The University of Toronto is strongly commit- 
ted to diversity within its community and espe- 
cially welcomes applications from visible minority 
group members, women, Aboriginal persons, per- 
sons with disabilities, members of sexual minor- 
ity groups, and others who may contribute to the 
further diversification of ideas. All qualified candi- 
dates are encouraged to apply; however, Canadians 
and permanent residents will be given priority. 

The Mendelzon Visiting Assistant Professor- 
ship is a position created in memory of Alberto 
Mendelzon, FRSC, distinguished computer sci- 
entist, and former chair of the Department of 
Computer and Mathematical Sciences, University 
of Toronto Scarborough. 

PLEASE NOTE THAT WE ARE ONLY ACCEPT- 
ING APPLICATIONS AT: www.mathjobs.org 

For more information about the Department of 
Computer & Mathematical Sciences @ UTSC, please
visit our home page: www.utsc.utoronto.ca/~csms


ADVERTISING IN 
CAREER OPPORTUNITIES 


How to Submit a Classified Line Ad: Send an e-mail 
to acmmediasales@acm.org. Please include text, 
and indicate the issue or issues where the ad will
appear, and a contact name and number. 
Estimates: An insertion order will then be e-mailed 
back to you. The ad will be typeset according to
CACM guidelines. NO PROOFS can be sent. Classified 
line ads are NOT commissionable. 
Rates: $325.00 for six lines of text, 40 characters 
per line. $32.50 for each additional line after the 
first six. The MINIMUM is six lines. 
Deadlines: Five weeks prior to the publication date 
of the issue (which is the first of every month). 
Latest deadlines: http://www.acm.org/publications 
Career Opportunities Online: Classified and 
recruitment display ads receive a free duplicate 
listing on our website at: http://campus.acm.org/careercenter
Ads are listed for a period of 30 days.
For More Information Contact: 
ACM Media Sales 
at 212-626-0686 or 
acmmediasales@acm.org 


ACM Transactions on
Reconfigurable Technology
and Systems


This quarterly publication is a peer- 
reviewed and archival journal that 
covers reconfigurable technology, 
systems, and applications on recon- 
figurable computers. Topics include 
all levels of reconfigurable system 
abstractions and all aspects of recon- 
figurable technology including plat- 
forms, programming environments 
and application successes. 

last byte


DOI:10.1145/1646353.1646380 Peter Winkler 


Breaking Chocolate Bars


Welcome to three new puzzles. Solutions to the first two will be published
next month; the third is (as yet) unsolved. In each, the issue is how your intuition
matches up with the mathematics.


Figure 1. Charlie’s first bar at the beginning,
after four breaks, and after 12 breaks.


Readers are encouraged to submit prospective puzzles for future columns to puzzled@cacm.acm.org. 


Peter Winkler (puzzled@cacm.acm.org) is Professor of Mathematics and of Computer Science and Albert
Bradley Third Century Professor in the Sciences at Dartmouth College, Hanover, NH. 
CONNECT WITH OUR
COMMUNITY OF EXPERTS. 


www.reviews.com

They'll help you find the best new books
and articles in computing.


Computing Reviews is a collaboration between the ACM and Reviews.com. 


25TH ANNUAL ACM 
CONFERENCE ON 


SYSTEMS, 
PROGRAMMING, 
LANGUAGES, 
APPLICATIONS: 
SOFTWARE FOR 
HUMANITY 


MARCH 25, 2010
Submission deadline for
OOPSLA Research Papers,
Practitioner Reports,
Educators’ and Trainers’ Symposium,
and proposals for Tutorials,
Workshops, Panels


APRIL 23, 2010
Onward! Papers and Essays


JUNE 24, 2010
Submission deadline for Posters,
Demonstrations, Doctoral Symposium,
Onward! Films, and Student Research
Competition and Volunteers


LOCATION 
John Ascuaga's Nugget Hotel 
Reno/ Tahoe Nevada USA 


COLOCATED CONFERENCES
Onward!
Dynamic Language Symposium (DLS)
Pattern Languages of Programs (PLoP)
and more


CONFERENCE CHAIR 
William R. Cook, UT Austin 
chair@splashcon.org 


OOPSLA PROGRAM CHAIR 
Martin Rinard, MIT 
program@splashcon.org 


For information, please contact 
ACM Member Services Department 
1-800-342-6626 (US & Canada)
+1-212-626-0500 (global)
info@splashcon.org


SPLASH/OOPSLA is sponsored by 
ACM SIGPLAN and SIGSOFT 


WWW.SPLASHCON.ORG 


